Abstract

Let $M = (E, \mathcal{I})$ be a matroid and let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a family of subsets of $E$ of size $p$. A subfamily $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ is $q$-representative for $\mathcal{S}$ if for every set $Y \subseteq E$ of size at most $q$, if there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$, then there is a set $\widehat{X} \in \widehat{\mathcal{S}}$ disjoint from $Y$ with $\widehat{X} \cup Y \in \mathcal{I}$. By the classical result of Bollobás, in a uniform matroid, every family of sets of size $p$ has a $q$-representative family with at most $\binom{p+q}{p}$ sets. In his famous "two families theorem" from 1977, Lovász proved that the same bound also holds for any matroid representable over a field $\mathbb{F}$. As observed by Marx, Lovász's proof is constructive. In this paper we show how Lovász's proof can be turned into an algorithm constructing a $q$-representative family of size at most $\binom{p+q}{p}$ in time bounded by a polynomial in $\binom{p+q}{p}$, $t$, and the time required for field operations.

We demonstrate how the efficient construction of representative families can be a powerful tool for designing single-exponential parameterized and exact exponential time algorithms. The applications of our approach include the following.

• In the Long Directed Cycle problem the input is a directed $n$-vertex graph $G$ and a positive integer $k$. The task is to find a directed cycle of length at least $k$ in $G$, if such a cycle exists. As a consequence of our $8^{k+o(k)} n^{O(1)}$ time algorithm, we have that a directed cycle of length at least $\log n$, if such a cycle exists, can be found in polynomial time. As it was shown by Björklund, Husfeldt, and Khanna [ICALP 2004], under an appropriate complexity assumption, it is impossible to improve this guarantee by more than a constant factor. Thus our algorithm not only improves over the best previous $\log n / \log\log n$ bound of Gabow and Nie [SODA 2004] but also closes the gap between known lower and upper bounds for this problem.

• In the Minimum Equivalent Graph (MEG) problem we are seeking a spanning subdigraph $D'$ of a given $n$-vertex digraph $D$ with as few arcs as possible in which the reachability relation is the same as in the original digraph $D$. The existence of a single-exponential $c^n$-time algorithm for some constant $c > 1$ for MEG was open since the work of Moyles and Thompson [JACM 1969].

• To demonstrate the diversity of applications of the approach, we provide an alternative proof of the results recently obtained by Bodlaender, Cygan, Kratsch and Nederlof for algorithms on graphs of bounded treewidth, who showed that many "connectivity" problems such as Hamiltonian Cycle or Steiner Tree can be solved in time $2^{O(t)} n$ on $n$-vertex graphs of treewidth at most $t$. We believe that expressing graph problems in "matroid language" sheds light on what makes it possible to solve connectivity problems in single-exponential time parameterized by treewidth.

For the special case of uniform matroids on $n$ elements, we give a faster algorithm computing a representative family in time $O\left(\left(\frac{p+q}{q}\right)^q \cdot 2^{o(p+q)} \cdot t \cdot \log n\right)$. We use this algorithm to provide the fastest known deterministic parameterized algorithms for $k$-Path, $k$-Tree, and more generally, for $k$-Subgraph Isomorphism, where the $k$-vertex pattern graph is of constant treewidth. For example, our $k$-Path algorithm runs in time $O(2.851^k n \log^2 n)$.
1 Introduction

The theory of matroids provides a deep insight into the tractability of a number of fundamental problems in Combinatorial Optimization like Minimum Weight Spanning Tree or Perfect Matching. Marx in [33] was the first to apply matroids to design fixed-parameter tractable algorithms. The main tool used by Marx was the notion of representative families. Representative families for set systems were introduced by Monien in [35].

Let $M = (E, \mathcal{I})$ be a matroid and let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a family of subsets of $E$ of size $p$. A subfamily $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ is $q$-representative for $\mathcal{S}$ if for every set $Y \subseteq E$ of size at most $q$, if there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$, then there is a set $\widehat{X} \in \widehat{\mathcal{S}}$ disjoint from $Y$ and $\widehat{X} \cup Y \in \mathcal{I}$. In other words, if a set $Y$ of size at most $q$ can be extended to an independent set of size $|Y| + p$ by adding a subset from $\mathcal{S}$, then it also can be extended to an independent set of size $|Y| + p$ by adding a subset from $\widehat{\mathcal{S}}$ as well.



The Two-Families Theorem of Bollobás [8] for extremal set systems and its generalization to subspaces of a vector space of Lovász [29] (see also [19]) imply that every family of sets of size $p$ has a $q$-representative family with at most $\binom{p+q}{p}$ sets. These theorems are the cornerstones in extremal set theory with numerous applications in graph and hypergraph theory, combinatorial geometry and theoretical computer science. We refer to Section 9.2.2 of [23], surveys of Tuza [42, 43], and Gil Kalai's blog for more information on the theorems and their applications.

For set families, or equivalently for uniform matroids, Monien provided an algorithm computing a $q$-representative family of size at most $\sum_{i=0}^{q} p^i$ in time $O\left(pq \cdot \sum_{i=0}^{q} p^i \cdot t\right)$ [35]. Marx in [32] provided another algorithm, also for uniform matroids, for finding $q$-representative families of size at most $\binom{p+q}{p}$ in time $O(p^q \cdot t^2)$. For linear matroids, Marx [33] has shown how Lovász's proof can be transformed into an algorithm computing a $q$-representative family. However, the running time of the algorithm given in [33] is $f(p,q)(\|A_M\| t)^{O(1)}$, where $f(p,q)$ is a polynomial in $(p+q)^p$ and $\binom{p+q}{p}$, that is, $f(p,q) = 2^{O(p \log(p+q))} \cdot \binom{p+q}{p}^{O(1)}$, and $A_M$ is the matroid's representation matrix. Thus, when $p$ is a constant, which is the way this lemma has been recently used in the kernelization algorithms [28], we have that $f(p,q) = (p+q)^{O(1)}$. However, for
unbounded p (for an example when p = q = 2) the running time of this algorithm is bounded 

Our results. We give two faster algorithms computing representative families and show how 
they can be used to obtain improved parameterized and exact exponential algorithms for several 
fundamental and well-studied problems.
Our first result is the following.

Theorem 1. Let $M = (E, \mathcal{I})$ be a linear matroid of rank $p + q = k$ given together with its representation matrix $A_M$ over a field $\mathbb{F}$. Let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a family of independent sets of size $p$. Then a $q$-representative family $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ for $\mathcal{S}$ with at most $\binom{p+q}{p}$ sets can be found in $O\left(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\right)$ operations over $\mathbb{F}$. Here, $\omega < 2.373$ is the matrix multiplication exponent.

Actually, we will prove a more general variant of Theorem 1 which allows sets to have weights. This extension will be used in several applications. This theorem uses the notion of weighted representative families and computes a weighted $q$-representative family of size at most $\binom{p+q}{p}$ within the running time claimed in Theorem 1. The proof of Theorem 1 relies on the exterior algebra based proof of Lovász [29] and exploits the multi-linearity of the determinant function.

For the case of uniform matroids, we provide the following theorem.

Theorem 2. Let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a family of sets of size $p$ over a universe of size $n$. For a given $q$, a $q$-representative family $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ for $\mathcal{S}$ with at most $\binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$ sets can be computed in time $O\left(\left(\frac{p+q}{q}\right)^q \cdot 2^{o(p+q)} \cdot t \cdot \log n\right)$.

The proof of Theorem 2 is essentially an algorithmic variant of the "random permutation" proof of Bollobás Lemma (see [23, Theorem 8.7]). A slightly weaker variant of Bollobás Lemma can be proved using random partitions instead of random permutations, the advantage of the random partitions proof being that it can be de-randomized using efficient constructions of universal sets [38]. To obtain our results we define separating collections and give efficient constructions of them.

Separating collections can be seen as a variant of universal sets. An $n$-$p$-$q$-separating collection $\mathcal{C}$ is a pair $(\mathcal{F}, \chi)$, where $\mathcal{F}$ is a family of sets over a universe $U$ of size $n$ and $\chi$ is a function from $\binom{U}{p}$ to $2^{\mathcal{F}}$ such that the following two properties are satisfied: (a) for every $A \in \binom{U}{p}$ and $F \in \chi(A)$, $A \subseteq F$; (b) for every $A \in \binom{U}{p}$ and $B \in \binom{U \setminus A}{q}$, there is an $F \in \chi(A)$ such that $A \subseteq F$ and $F \cap B = \emptyset$. The size of $(\mathcal{F}, \chi)$ is $|\mathcal{F}|$, whereas the max degree of $(\mathcal{F}, \chi)$ is $\max_{A \in \binom{U}{p}} |\chi(A)|$. Here $2^{S}$ for a set $S$ is the family of all subsets of $S$ while $\binom{S}{p}$ is the family of all subsets of $S$ of size $p$.

An efficient construction of separating collections is an algorithm that given $n$, $p$ and $q$ outputs the family $\mathcal{F}$ of a separating collection $(\mathcal{F}, \chi)$ and then allows queries $\chi(A)$ for $A \in \binom{U}{p}$. We give constructions of separating collections of optimal (up to subexponential factors in $p + q$) size and degree, and construction and query time which is linear (up to subexponential factors in $p + q$) in the size of the output.
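To make properties (a) and (b) concrete, the following is a minimal brute-force sketch, not the efficient construction of this paper. The function names `is_separating_collection` and `trivial_collection` are hypothetical, and the "trivial" collection (all $p$-subsets, with $\chi(A) = \{A\}$) is only an illustrative baseline of size $\binom{n}{p}$ and max degree 1.

```python
from itertools import combinations

def is_separating_collection(universe, p, q, family, chi):
    """Brute-force check of properties (a) and (b) of an n-p-q-separating collection.

    family: list of frozensets over `universe` (the family F)
    chi:    dict mapping every p-subset A (a frozenset) to a list of sets from F
    """
    fam = set(family)
    for A in map(frozenset, combinations(universe, p)):
        sets_for_A = chi[A]
        # chi(A) must be a subfamily of F, and (a): every F in chi(A) contains A.
        if any(F not in fam or not A <= F for F in sets_for_A):
            return False
        # (b): every q-subset B of U \ A is avoided by some F in chi(A).
        rest = [u for u in universe if u not in A]
        for B in map(frozenset, combinations(rest, q)):
            if not any(F.isdisjoint(B) for F in sets_for_A):
                return False
    return True

def trivial_collection(universe, p):
    """Illustrative baseline (an assumption, not the paper's construction):
    F is the family of all p-subsets and chi(A) = {A}."""
    family = [frozenset(c) for c in combinations(universe, p)]
    chi = {A: [A] for A in family}
    return family, chi
```

For instance, `trivial_collection(range(5), 2)` passes the check for $q = 2$, whereas a collection whose only set is the whole universe satisfies (a) but fails (b).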

Applications. Here we provide the list of main applications that can be derived from our 
algorithms that compute representative families together with a short overview of previous 
work on each application. 



| Reference | Randomized | Deterministic |
|---|---|---|
| Monien [35] | - | $O(k!\, nm)$ |
| Bodlaender [6] | - | $O(k!\, 2^k n)$ |
| Alon et al. [1] | $O(5.44^k n)$ | $O(c^k n \log n)$ for a large $c$ |
| Kneis et al. [25] | $O^*(4^k)$ | $O^*(16^k)$ |
| Chen et al. [10] | $O(4^k k^{2.7} m)$ | $4^{k+O(\log^3 k)} nm$ |
| Koutis [26] | $O^*(2.83^k)$ | - |
| Williams [44] | $O^*(2^k)$ | - |
| Björklund et al. [4] | $O^*(1.66^k)$ | - |
| This paper | - | $O(2.851^k n \log^2 n)$ |

Table 1: Results for $k$-Path. We use $O^*()$ notation that hides factors polynomial in the number of vertices $n$ and the parameter $k$ in cases when the authors do not specify the power of polynomials.



$k$-Path. In the $k$-Path problem we are given an undirected $n$-vertex graph $G$ and integer $k$. The question is if $G$ contains a path of length $k$. $k$-Path was studied intensively within the parameterized complexity paradigm [12]. For $n$-vertex graphs the problem is trivially solvable in time $O(n^k)$. Monien [35] and Bodlaender showed that the problem is fixed parameter tractable. Monien used representative families for set systems for his $k$-Path algorithm [35] and Plehn and Voigt extended this algorithm to Subgraph Isomorphism in [41]. This led Papadimitriou and Yannakakis [40] to conjecture that the problem is solvable in polynomial time for $k = \log n$. This conjecture was resolved in a seminal paper of Alon et al. [1], who introduced the method of color-coding and obtained the first single exponential algorithm for the problem. Actually, the method of Alon et al. can be applied for more general problems, like finding a $k$-path in directed graphs, or to solve the Subgraph Isomorphism problem in time $2^{O(k)} n^{O(t)}$, when the treewidth of the pattern graph is bounded by $t$. There has been a lot of effort in parameterized algorithms to reduce the base of the exponent of both deterministic as well as the randomized algorithms for the $k$-Path problem, see Table 1. After the work of Alon et al. [1], there were several breakthrough ideas leading to faster and faster randomized algorithms. Concerning deterministic algorithms, no improvements occurred since 2007, when Chen et al. [10] showed a clever way of applying universal sets to reduce the running time of the color-coding algorithm to $O^*(4^{k+o(k)})$.



$k$-Path is a special case of the $k$-Subgraph Isomorphism problem, where for a given $n$-vertex graph $G$ and $k$-vertex graph $F$, the question is whether $G$ contains a subgraph isomorphic to $F$. In addition to $k$-Path, parameterized algorithms for two other variants of $k$-Subgraph Isomorphism, when $F$ is a tree, and more generally, a graph of treewidth at most $t$, were studied in the literature. Alon et al. [1] showed that $k$-Subgraph Isomorphism, when the treewidth of the pattern graph is bounded by $t$, is solvable in time $2^{O(k)} n^{O(t)}$. Cohen et al. gave a randomized algorithm that for an input digraph $D$ decides in time $5.704^k n^{O(1)}$ if $D$ contains a given out-tree with $k$ vertices [12]. They also showed how to derandomize the algorithm in time $6.14^k n^{O(1)}$. Amini et al. introduced an inclusion-exclusion based approach in the classical color-coding and gave a randomized $5.4^k n^{O(t)}$ time algorithm and a deterministic $5.4^{k+o(k)} n^{O(t)}$ time algorithm for the case when $F$ has treewidth at most $t$. Koutis and Williams [27] generalized their algebraic approach for $k$-Path to $k$-Tree and obtained a randomized algorithm running in time $2^k n^{O(1)}$ for $k$-Tree. A superset of the authors in [18] extended this result by providing a randomized algorithm for $k$-Subgraph Isomorphism running in time $2^k (nt)^{O(t)}$, when the treewidth of $F$ is at most $t$. However, the fastest known deterministic algorithm for this problem prior to this paper was the time $5.4^{k+o(k)} n^{O(t)}$ algorithm from [2]. In this paper we give deterministic algorithms for $k$-Path and $k$-Tree that run in time $O(2.851^k n \log^2 n)$ and $O(2.851^k n^{O(1)})$. The algorithm for $k$-Tree can be generalized to $k$-Subgraph Isomorphism for the case when the pattern graph $F$ has treewidth at most $t$. This algorithm will run in time $O(2.851^k n^{O(t)})$. Our approach can also be applied to find directed paths and cycles of length $k$ in time $O(2.851^k m \log^2 n)$ and $O(2.851^k n^{O(1)})$ respectively.

Longest Directed Cycle. In the Longest Directed Cycle problem we are interested in finding a cycle of length at least $k$ in a directed graph. For this problem we give an algorithm of running time $O(8^{k+o(k)} m n^2 \log n)$.

While at first glance the problem is similar to the problem of finding a cycle or a path of length exactly $k$, it is more tricky. The reason is that the problem of finding a cycle of length $\geq k$ may entail finding a much longer, potentially even a Hamiltonian cycle. This is why color-coding, and other techniques applicable to $k$-Path do not seem to work here. Even for undirected graphs color-coding alone is not sufficient, and one needs an additional clever trick to
make it work. The first fixed-parameter tractable algorithm for Longest Directed Cycle 
is due to Gabow and Nie [20], who gave algorithms with expected running time $k^{2k} 2^{O(k)} nm$ and worst-case times $O(k^{2k} 2^{O(k)} nm \log n)$ or $O(k^{3k} nm)$. These running times allow them to
find a directed cycle of length at least log n/ log log n in polynomial time. Let us note, that 
our algorithm implies that one can find in polynomial time a directed cycle of length at least 
log n if there is such a cycle. On the other hand, Bjorklund et al. [5] have shown that assuming 
the Exponential Time Hypothesis (ETH) of Impagliazzo et al. [22], there is no polynomial time
algorithm that finds a directed cycle of length $\Omega(f(n) \log n)$, for any nondecreasing, unbounded,
polynomial time computable function $f$ that tends to infinity. Thus, our work closes the gap
between the upper and lower bounds for this problem. 

Minimum Equivalent Graph. Our next application is from exact exponential time algorithms; we refer to [17] for an introduction to the area of exact algorithms. In the Minimum Equivalent Graph (MEG) problem we are seeking a spanning subdigraph $D'$ of a given digraph $D$ with as few arcs as possible in which the reachability relation is the same as in the original digraph $D$. In other words, for every pair of vertices $u, v$, there is a path from $u$ to $v$ in $D'$ if and only if the original digraph $D$ has such a path. We show that this problem is solvable in time $O(2^{4\omega n} m^2 n)$, where $n$ is the number of vertices and $m$ is the number of arcs in $D$.

MEG is a classical NP-hard problem generalizing the Hamiltonian Cycle problem, see 
Chapter 12 of the book [3] for an overview of combinatorial and algorithmic results on MEG. 



The algorithmic studies of MEG can be traced to the work of Moyles and Thompson [36] from 1969, who gave a (non-trivial) branching algorithm solving MEG in time $O(n!)$. In 1975, Hsu in [21] discovered a mistake in the algorithm of Moyles and Thompson, and designed a different branching algorithm for this problem. Martello [30] and Martello and Toth [31] gave another branching based algorithm with running time $O(2^m)$. No single-exponential exact algorithm, i.e. of running time $2^{O(n)}$, for MEG was known prior to our work.

As it was already observed by Moyles and Thompson [36], the hardest instances of MEG are strong digraphs. A digraph is strong if for every pair of vertices $u \neq v$, there are directed paths from $u$ to $v$ and from $v$ to $u$. MEG restricted to strong digraphs is known as the Minimum SCSS (strongly connected spanning subgraph) problem. It is known that the MEG problem reduces in linear time to Minimum SCSS, see e.g. [13].

Treewidth algorithms. We show that efficient computation of representative families can be used to obtain in time $2^{O(t)} n$, where $t$ is the treewidth of the input $n$-vertex graph, algorithms solving "connectivity" problems like Hamiltonian Cycle or Steiner Tree. It is well known that many intractable problems can be solved efficiently when the input graph has bounded treewidth. Moreover, many fundamental problems like Maximum Independent Set or Minimum Dominating Set can be solved in time $2^{O(t)} n$, where $t$ is the treewidth of the input $n$-vertex graph. On the other hand, it was believed until very recently that for some "connectivity" problems such as Hamiltonian Cycle or Steiner Tree no such algorithm exists.
In their breakthrough paper, Cygan et al. P^ introduced a new algorithmic framework called 
Cut&Count and used it to obtain $2^{O(t)} n^{O(1)}$ time Monte Carlo algorithms for a number of con-
nectivity problems. Very recently, Bodlaender et al. [7] obtained the first deterministic single 
exponential algorithms for these problems. Bodlaender et al. presented two approaches, one 
based on rank estimations in specific matrices and the second based on matrix-tree theorem and 
computation of determinants. Our approach, based on representative families in matroids, can 
be seen as an alternative path to obtaining similar results. The main idea behind our approach 
is that all the relevant information about "partial solutions" in bags of the tree decomposi- 
tion, can be encoded as an independent set of a specific matroid. Here efficient computation of 
representative families comes into play. 

In all our applications we first define a specific matroid and then show a combinatorial 
relation between solution to the problem and independent sets of the matroid. Then we compute 
representative families using Theorem 1 or Theorem 2 and use them to obtain a solution to
the problem. We believe that expressing graph problems in "matroid language" is a generic 
technique explaining why certain problems admit single-exponential parameterized and exact 
exponential algorithms. 

Organization of the paper. In Section 2 we give the necessary definitions and state some of the known results that we will use. In Section 3 we prove Theorem 1 by giving an efficient algorithm for the computation of representative families for linear matroids. In Section 4 we prove Theorem 2 by giving an efficient algorithm for the computation of representative families for uniform matroids. In Section 5 we give all our applications of Theorems 1 and 2. Concluding remarks can be found in Section 6. The proofs of Theorem 1 and Theorem 2 are independent of each other and may be read independently. All of our applications use Theorems 1 and 2 as black boxes, and thus may be read independently of the sections describing the efficient computation of representative families.

2 Preliminaries 

In this section we give various definitions which we make use of in the paper. 



Graphs. Let $G$ be a graph with vertex set $V(G)$ and edge set $E(G)$. A graph $G'$ is a subgraph of $G$ if $V(G') \subseteq V(G)$ and $E(G') \subseteq E(G)$. The subgraph $G'$ is called an induced subgraph of $G$ if $E(G') = \{uv \in E(G) \mid u, v \in V(G')\}$; in this case, $G'$ is also called the subgraph induced by $V(G')$ and denoted by $G[V(G')]$. For a vertex set $S$, by $G \setminus S$ we denote $G[V(G) \setminus S]$. By $N(u)$ we denote the (open) neighborhood of $u$, that is, the set of all vertices adjacent to $u$. Similarly, by $N[u] = N(u) \cup \{u\}$ we define the closed neighborhood. The degree of a vertex $v$ in $G$ is $|N_G(v)|$ and is denoted by $d(v)$. For a subset $S \subseteq V(G)$, we define $N[S] = \bigcup_{v \in S} N[v]$ and $N(S) = N[S] \setminus S$. By the length of a path we mean the number of edges in it.

Digraphs. Let $D$ be a digraph. By $V(D)$ and $A(D)$ we represent the vertex set and arc set of $D$, respectively. Given a subset $V' \subseteq V(D)$ of a digraph $D$, let $D[V']$ denote the digraph induced by $V'$. A digraph $D$ is strong if for every pair $x, y$ of vertices there are directed paths from $x$ to $y$ and from $y$ to $x$. A maximal strongly connected subdigraph of $D$ is called a strong component. A vertex $u$ of $D$ is an in-neighbor (out-neighbor) of a vertex $v$ if $uv \in A(D)$ ($vu \in A(D)$, respectively). The in-degree $d^-(v)$ (out-degree $d^+(v)$) of a vertex $v$ is the number of its in-neighbors (out-neighbors). We denote the set of in-neighbors and out-neighbors of a vertex $v$ by $N^-(v)$ and $N^+(v)$ correspondingly. A closed directed walk in a digraph $D$ is a sequence $v_0 v_1 \cdots v_\ell$ of vertices of $D$, not necessarily distinct, such that $v_0 = v_\ell$ and for every $0 \le i \le \ell - 1$, $v_i v_{i+1} \in A(D)$.

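Sets, Functions and Constants. Let $[n] = \{1, \ldots, n\}$ and $\binom{[n]}{i} = \{X \mid X \subseteq [n], |X| = i\}$.

Definition 2.1. Given two families of sets $\mathcal{L}_1$ and $\mathcal{L}_2$, we define

$$\mathcal{L}_1 \bullet \mathcal{L}_2 = \{X \cup Y \mid X \in \mathcal{L}_1 \text{ and } Y \in \mathcal{L}_2 \text{ and } X \cap Y = \emptyset\}.$$

Let $\mathcal{L}_1, \ldots, \mathcal{L}_r$ be $r$ families. Then

$$\prod^{\bullet}_{i \in [r]} \mathcal{L}_i = \mathcal{L}_1 \bullet \cdots \bullet \mathcal{L}_r.$$

Definition 2.2. For two families $\mathcal{A}$ and $\mathcal{B}$ over (subsets of) $U$, we define

$$\mathcal{A} \circ \mathcal{B} = \{A \cup B : A \in \mathcal{A} \wedge B \in \mathcal{B}\}.$$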

Definition 2.3. For a family A over (subsets of) U and set X, we define 

A®X = {AUX : AG A}. 
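The three operations on set families defined above translate directly into set comprehensions. The following is a small illustrative sketch; the function names `bullet`, `circ` and `add_set` are hypothetical, and families are assumed to be Python sets of `frozenset` objects.

```python
def bullet(L1, L2):
    """Definition 2.1: unions X | Y over pairs with X in L1, Y in L2, X and Y disjoint."""
    return {X | Y for X in L1 for Y in L2 if X.isdisjoint(Y)}

def circ(A, B):
    """Definition 2.2: all pairwise unions, with no disjointness requirement."""
    return {a | b for a in A for b in B}

def add_set(A, X):
    """Definition 2.3: the fixed set X is added to every member of the family A."""
    return {a | X for a in A}

# Small example families over the universe {1, 2, 3, 9}.
L1 = {frozenset({1}), frozenset({2})}
L2 = {frozenset({2}), frozenset({3})}
disjoint_unions = bullet(L1, L2)           # the pair ({2}, {2}) is skipped
all_unions = circ(L1, L2)                  # the pair ({2}, {2}) contributes {2}
shifted = add_set(L1, frozenset({9}))
```

The only difference between the first two operations is the disjointness filter, which is what keeps every set in $\mathcal{L}_1 \bullet \mathcal{L}_2$ of size exactly $p_1 + p_2$ when the inputs are a $p_1$-family and a $p_2$-family.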

Throughout the paper we use $\omega$ to denote the matrix multiplication exponent. The current best known bound is $\omega < 2.373$ [45]. We use $e$ to denote the base of the natural logarithm.

2.1 Randomized Algorithms 

We follow the same notion of randomized algorithms as described in [33, Section 2.3]. That is, some of the algorithms presented in this paper are randomized, which means that they can produce an incorrect answer, but the probability of doing so is small. We assume that the algorithm has an integer parameter $P$ given in unary, and the probability of an incorrect answer is $2^{-P}$.



2.2 Matroids 

In the next few subsections we give definitions related to matroids. For a broader overview on 
matroids we refer to [39]. 

Definition 2.4. A pair $M = (E, \mathcal{I})$, where $E$ is a ground set and $\mathcal{I}$ is a family of subsets (called independent sets) of $E$, is a matroid if it satisfies the following conditions:

(I1) $\emptyset \in \mathcal{I}$.

(I2) If $A' \subseteq A$ and $A \in \mathcal{I}$ then $A' \in \mathcal{I}$.

(I3) If $A, B \in \mathcal{I}$ and $|A| < |B|$, then $\exists\, e \in (B \setminus A)$ such that $A \cup \{e\} \in \mathcal{I}$.

The axiom (I2) is also called the hereditary property and a pair $(E, \mathcal{I})$ satisfying only (I2) is called a hereditary family. An inclusion-wise maximal set of $\mathcal{I}$ is called a basis of the matroid. Using axiom (I3) it is easy to show that all the bases of a matroid have the same size. This size is called the rank of the matroid $M$, and is denoted by rank($M$).
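For very small ground sets the three axioms can be checked exhaustively. The sketch below is only an illustration under that assumption (it enumerates all $2^{|E|}$ subsets); the names `powerset` and `is_matroid`, and the representation of the set system by an independence oracle, are choices made here and not part of the paper.

```python
from itertools import combinations

def powerset(E):
    """All subsets of E as frozensets."""
    E = list(E)
    return [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

def is_matroid(E, independent):
    """Exhaustively check axioms (I1)-(I3) for the set system given by the
    oracle `independent` (a callable frozenset -> bool)."""
    I = [A for A in powerset(E) if independent(A)]
    I_set = set(I)
    # (I1): the empty set is independent.
    if frozenset() not in I_set:
        return False
    # (I2): closure under removing one element implies closure under all subsets.
    for A in I:
        if any(A - {e} not in I_set for e in A):
            return False
    # (I3): a smaller independent set can be augmented from a larger one.
    for A in I:
        for B in I:
            if len(A) < len(B) and not any(A | {e} in I_set for e in B - A):
                return False
    return True

def uniform_rank2(A):
    """Independence oracle of a uniform matroid of rank 2."""
    return len(A) <= 2

def not_a_matroid(A):
    """A hereditary family that violates (I3): {3} cannot be augmented from {1, 2}."""
    return A <= frozenset({1, 2}) or A <= frozenset({3})
```

On the ground set $\{1,2,3,4\}$ the first oracle passes all three axioms, while the second satisfies (I1) and (I2) but fails the exchange axiom (I3).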

2.3 Linear Matroids and Representable Matroids 

Let $A$ be a matrix over an arbitrary field $\mathbb{F}$ and let $E$ be the set of columns of $A$. Given $A$ we define the matroid $M = (E, \mathcal{I})$ as follows. A set $X \subseteq E$ is independent (that is $X \in \mathcal{I}$) if the corresponding columns are linearly independent over $\mathbb{F}$. The matroids that can be defined by such a construction are called linear matroids, and if a matroid can be defined by a matrix $A$ over a field $\mathbb{F}$, then we say that the matroid is representable over $\mathbb{F}$. That is, a matroid $M = (E, \mathcal{I})$ is representable over a field $\mathbb{F}$ if there exist vectors in $\mathbb{F}^d$ that correspond to the elements such that the linearly independent sets of vectors precisely correspond to independent sets of the matroid. Here, $d = $ rank($M$). A matroid $M = (E, \mathcal{I})$ is called representable or linear if it is representable over some field $\mathbb{F}$.
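As an illustration of the independence test in a linear matroid, the following sketch works over the rationals (an assumption made here so that exact arithmetic is available through `fractions.Fraction`; the paper allows an arbitrary field). The helper names `rank` and `is_independent` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        # Find a pivot in column c among the rows not yet used.
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            if f != 0:
                M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def is_independent(A, cols):
    """A set of column indices is independent iff those columns have full column rank."""
    cols = sorted(cols)
    if not cols:
        return True
    sub = [[row[j] for j in cols] for row in A]
    return rank(sub) == len(cols)

# Representation matrix of a rank-2 linear matroid on three elements (columns 0, 1, 2).
A = [[1, 0, 1],
     [0, 1, 1]]
```

Here every pair of columns of `A` is independent, while all three columns together are dependent because the matrix has only two rows.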

2.4 Direct Sum of Matroids. 

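Let $M_1 = (E_1, \mathcal{I}_1)$, $M_2 = (E_2, \mathcal{I}_2)$, ..., $M_t = (E_t, \mathcal{I}_t)$ be $t$ matroids with $E_i \cap E_j = \emptyset$ for all $1 \le i \ne j \le t$. The direct sum $M_1 \oplus \cdots \oplus M_t$ is a matroid $M = (E, \mathcal{I})$ with $E := \bigcup_{i=1}^{t} E_i$ and $X \subseteq E$ is independent if and only if for all $i \le t$, $X \cap E_i \in \mathcal{I}_i$. Let $A_i$ be the representation matrix of $M_i = (E_i, \mathcal{I}_i)$. Then,

$$A_M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_t \end{pmatrix}$$

is a representation matrix of $M_1 \oplus \cdots \oplus M_t$. The correctness of this construction is proved in [33].

Proposition 2.1 ([33, Proposition 3.4]). Given representations of matroids $M_1, \ldots, M_t$ over the same field $\mathbb{F}$, a representation of their direct sum can be found in polynomial time.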



2.5 Uniform and Partition Matroids 

A pair $M = (E, \mathcal{I})$ over an $n$-element ground set $E$ is called a uniform matroid if the family of independent sets is given by $\mathcal{I} = \{A \subseteq E \mid |A| \le k\}$, where $k$ is some constant. This matroid is also denoted as $U_{n,k}$. Every uniform matroid is linear and can be represented over a finite field by a $k \times n$ matrix $A_M$ where $A_M[i,j] = j^{i-1}$.

$$A_M = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & n \\ 1 & 2^2 & 3^2 & \cdots & n^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 2^{k-1} & 3^{k-1} & \cdots & n^{k-1} \end{pmatrix}$$

Observe that for $A_M$ to be a representation over a finite field $\mathbb{F}$, we need that the determinant of any $k \times k$ submatrix of $A_M$ must not vanish over $\mathbb{F}$. The determinant of any $k \times k$ submatrix of $A_M$ is upper bounded by $k! \times n^{k^2}$ (this follows from the Laplace expansion of determinants). Thus, choosing a field $\mathbb{F}$ of size larger than $k! \times n^{k^2}$ suffices.
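The Vandermonde representation can be checked experimentally on small instances. The sketch below builds $A_M[i,j] = j^{i-1}$ modulo a prime and tests that every $k \times k$ column submatrix is nonsingular. As an assumption of this sketch (not the bound stated in the text), it uses a prime modulus larger than $n$, which is enough for this particular matrix because a Vandermonde determinant is a product of differences of the evaluation points. The function names are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
from itertools import combinations

def vandermonde(n, k, p):
    """k x n matrix with entry j^(i-1) mod p in row i, column j (1-indexed as in the text)."""
    return [[pow(j, i, p) for j in range(1, n + 1)] for i in range(k)]

def det_mod(M, p):
    """Determinant of a square matrix modulo a prime p, by Gaussian elimination."""
    M = [row[:] for row in M]
    n = len(M)
    det = 1
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = -det                      # a row swap flips the sign
        det = det * M[c][c] % p
        inv = pow(M[c][c], -1, p)           # modular inverse of the pivot
        for r in range(c + 1, n):
            f = M[r][c] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[c])]
    return det % p

def represents_uniform_matroid(n, k, p):
    """True iff every set of k columns of the Vandermonde matrix is independent mod p."""
    A = vandermonde(n, k, p)
    return all(
        det_mod([[A[i][j] for j in cols] for i in range(k)], p) != 0
        for cols in combinations(range(n), k)
    )
```

With $n = 5$, $k = 3$ the check succeeds modulo 7 but fails modulo 3, where the columns for $j = 1$ and $j = 4$ coincide.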

A partition matroid $M = (E, \mathcal{I})$ is defined by a ground set $E$ being partitioned into (disjoint) sets $E_1, \ldots, E_\ell$ and by $\ell$ non-negative integers $k_1, \ldots, k_\ell$. A set $X \subseteq E$ is independent if and only if $|X \cap E_i| \le k_i$ for all $i \in \{1, \ldots, \ell\}$. Observe that a partition matroid is a direct sum of uniform matroids $U_{|E_1|,k_1}, \ldots, U_{|E_\ell|,k_\ell}$. Thus, by Proposition 2.1 and the fact that a uniform matroid $U_{n,k}$ is representable over a field $\mathbb{F}$ of size larger than $k! \times n^{k^2}$ we have the following.

Proposition 2.2 ([33, Proposition 3.5]). A representation over a field of size $O(k! \times |E|^{k^2})$ of a partition matroid can be constructed in polynomial time.

2.6 Graphic Matroids 

Given a graph $G$, a graphic matroid $M = (E, \mathcal{I})$ is defined by taking the edges of $G$ as elements (that is $E = E(G)$) and $F \subseteq E(G)$ is in $\mathcal{I}$ if it forms a spanning forest in the graph $G$. The graphic matroid is representable over any field of size at least 2. Consider the matrix $A_M$ with a row for each vertex $i \in V(G)$ and a column for each edge $e = ij \in E(G)$. In the column corresponding to $e = ij$, all entries are 0, except for a 1 in $i$ or $j$ (arbitrarily) and a $-1$ in the other. This is a representation over the reals. To obtain a representation over a field $\mathbb{F}$, one simply needs to take the representation given above over the reals and replace all $-1$ by the additive inverse of 1.

Proposition 2.3 ([32]). Graphic matroids are representable over any field of size at least 2.
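Independence in a graphic matroid can also be tested combinatorially, without the incidence matrix: an edge set is independent exactly when it contains no cycle. The following minimal sketch uses a union-find structure; the function name `is_forest` and the representation of edges as pairs of hashable vertex labels are assumptions of this illustration.

```python
def is_forest(edges):
    """Return True iff the given edge set is acyclic, i.e. independent in the graphic matroid."""
    parent = {}

    def find(x):
        # Find the representative of x's component, with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            # u and v are already connected, so this edge closes a cycle.
            return False
        parent[ru] = rv
    return True
```

For example, a path on three vertices is independent, whereas a triangle or a self-loop is not.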

2.7 Truncation of a Matroid. 

The $t$-truncation of a matroid $M = (E, \mathcal{I})$ is a matroid $M' = (E, \mathcal{I}')$ such that $S \subseteq E$ is independent in $M'$ if and only if $|S| \le t$ and $S$ is independent in $M$ (that is $S \in \mathcal{I}$).

Proposition 2.4 ([33, Proposition 3.7]). Given a matroid $M$ with a representation $A$ over a finite field $\mathbb{F}$ and an integer $t$, a representation of the $t$-truncation $M'$ can be found in randomized polynomial time.

3 Fast Computation for Representative Sets for Linear Matroids

In this section we give an algorithm to find a $q$-representative family of a given family. We start with the definition of a $q$-representative family.



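Definition 3.1 ($q$-Representative Family). Given a matroid $M = (E, \mathcal{I})$ and a family $\mathcal{S}$ of subsets of $E$, we say that a subfamily $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ is $q$-representative for $\mathcal{S}$ if the following holds: for every set $Y \subseteq E$ of size at most $q$, if there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$, then there is a set $\widehat{X} \in \widehat{\mathcal{S}}$ disjoint from $Y$ with $\widehat{X} \cup Y \in \mathcal{I}$. If $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ is $q$-representative for $\mathcal{S}$ we write $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$.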
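The definition can be checked by brute force on tiny instances, which is useful for sanity-testing any faster algorithm. The sketch below is exponential in $q$ and is not the algorithm of this section; the function name `is_q_representative` and the use of an independence oracle are assumptions of this illustration.

```python
from itertools import combinations

def is_q_representative(E, independent, S, S_hat, q):
    """Brute-force check that S_hat is a q-representative subfamily of S.

    E:           iterable of ground-set elements
    independent: callable frozenset -> bool (independence oracle of the matroid)
    S, S_hat:    lists of frozensets
    """
    if not set(S_hat) <= set(S):
        return False
    E = list(E)
    for r in range(q + 1):
        for Y in map(frozenset, combinations(E, r)):
            def extends(family):
                # Some member of the family is disjoint from Y and fits with it.
                return any(X.isdisjoint(Y) and independent(X | Y) for X in family)
            if extends(S) and not extends(S_hat):
                return False
    return True

# Uniform matroid of rank 2 on {1, 2, 3, 4} with p = q = 1.
ground = [1, 2, 3, 4]

def rank2(A):
    return len(A) <= 2

singletons = [frozenset({x}) for x in ground]
```

In this example two singletons already represent all four, matching the bound $\binom{p+q}{p} = 2$, while a single singleton $\{1\}$ does not, since it cannot serve $Y = \{1\}$.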

In other words, if some independent set in $\mathcal{S}$ can be extended to a larger independent set by $q$ new elements, then there is a set in $\widehat{\mathcal{S}}$ that can be extended by the same $q$ elements. A weighted variant of $q$-representative families is defined as follows. It is useful for solving problems where we are looking for objects of maximum or minimum weight.

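Definition 3.2 (Min/Max $q$-Representative Family). Given a matroid $M = (E, \mathcal{I})$, a family $\mathcal{S}$ of subsets of $E$ and a non-negative weight function $w : \mathcal{S} \to \mathbb{N}$ we say that a subfamily $\widehat{\mathcal{S}} \subseteq \mathcal{S}$ is min $q$-representative (max $q$-representative) for $\mathcal{S}$ if the following holds: for every set $Y \subseteq E$ of size at most $q$, if there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$, then there is a set $\widehat{X} \in \widehat{\mathcal{S}}$ disjoint from $Y$ with

1. $\widehat{X} \cup Y \in \mathcal{I}$; and

2. $w(\widehat{X}) \le w(X)$ ($w(\widehat{X}) \ge w(X)$).

We use $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ ($\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$) to denote a min $q$-representative (max $q$-representative) family for $\mathcal{S}$.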

We say that a family $\mathcal{S} = \{S_1, \ldots, S_t\}$ of independent sets is a $p$-family if each set in $\mathcal{S}$ is of size $p$.

We start with three lemmata providing basic results about representative sets. These lemmata will be used in Section 5 where we provide algorithmic applications of representative families. We prove them for unweighted representative families but they can be easily modified to work for the weighted variant.

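Lemma 3.1. Let $M = (E, \mathcal{I})$ be a matroid and $\mathcal{S}$ be a family of subsets of $E$. If $\mathcal{S}' \subseteq^{q}_{rep} \mathcal{S}$ and $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}'$, then $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$.

Proof. Let $Y \subseteq E$ of size at most $q$ such that there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$. By the definition of $q$-representative family we have that there is a set $X' \in \mathcal{S}'$ disjoint from $Y$ with $X' \cup Y \in \mathcal{I}$. Now the fact that $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}'$ yields that there exists a $\widehat{X} \in \widehat{\mathcal{S}}$ disjoint from $Y$ with $\widehat{X} \cup Y \in \mathcal{I}$. □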

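Lemma 3.2. Let $M = (E, \mathcal{I})$ be a matroid and $\mathcal{S}$ be a family of subsets of $E$. If $\mathcal{S} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_\ell$ and $\widehat{\mathcal{S}}_i \subseteq^{q}_{rep} \mathcal{S}_i$, then $\bigcup_{i=1}^{\ell} \widehat{\mathcal{S}}_i \subseteq^{q}_{rep} \mathcal{S}$.

Proof. Let $Y \subseteq E$ of size at most $q$ such that there is a set $X \in \mathcal{S}$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$. Since $\mathcal{S} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_\ell$, there exists an $i$ such that $X \in \mathcal{S}_i$. This implies that there exists a $\widehat{X} \in \widehat{\mathcal{S}}_i \subseteq \bigcup_{i=1}^{\ell} \widehat{\mathcal{S}}_i$ disjoint from $Y$ with $\widehat{X} \cup Y \in \mathcal{I}$. □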

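Lemma 3.3. Let $M = (E, \mathcal{I})$ be a matroid of rank $k$ and $\mathcal{S}_1$ be a $p_1$-family of independent sets, $\mathcal{S}_2$ be a $p_2$-family of independent sets, $\widehat{\mathcal{S}}_1 \subseteq^{k-p_1}_{rep} \mathcal{S}_1$ and $\widehat{\mathcal{S}}_2 \subseteq^{k-p_2}_{rep} \mathcal{S}_2$. Then $\widehat{\mathcal{S}}_1 \bullet \widehat{\mathcal{S}}_2 \subseteq^{k-p_1-p_2}_{rep} \mathcal{S}_1 \bullet \mathcal{S}_2$.

Proof. Let $Y \subseteq E$ of size at most $q = k - p_1 - p_2$ such that there is a set $X \in \mathcal{S}_1 \bullet \mathcal{S}_2$ disjoint from $Y$ with $X \cup Y \in \mathcal{I}$. This implies that there exist $X_1 \in \mathcal{S}_1$ and $X_2 \in \mathcal{S}_2$ such that $X_1 \cup X_2 = X$ and $X_1 \cap X_2 = \emptyset$. Since $\widehat{\mathcal{S}}_1 \subseteq^{k-p_1}_{rep} \mathcal{S}_1$, we have that there exists a $\widehat{X}_1 \in \widehat{\mathcal{S}}_1$ such that $\widehat{X}_1 \cup X_2 \cup Y \in \mathcal{I}$ and $\widehat{X}_1 \cap (X_2 \cup Y) = \emptyset$. Now since $\widehat{\mathcal{S}}_2 \subseteq^{k-p_2}_{rep} \mathcal{S}_2$, we have that there exists a $\widehat{X}_2 \in \widehat{\mathcal{S}}_2$ such that $\widehat{X}_1 \cup \widehat{X}_2 \cup Y \in \mathcal{I}$ and $\widehat{X}_2 \cap (\widehat{X}_1 \cup Y) = \emptyset$. This shows that $\widehat{X}_1 \cup \widehat{X}_2 \in \widehat{\mathcal{S}}_1 \bullet \widehat{\mathcal{S}}_2$ and $\widehat{X}_1 \cup \widehat{X}_2 \cup Y \in \mathcal{I}$, thus $\widehat{\mathcal{S}}_1 \bullet \widehat{\mathcal{S}}_2 \subseteq^{k-p_1-p_2}_{rep} \mathcal{S}_1 \bullet \mathcal{S}_2$. □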

The main result of this section is that given a representable matroid $M = (E, \mathcal{I})$ of rank $k = p + q$ with its representation matrix $A_M$, a $p$-family of independent sets $\mathcal{S} = \{S_1, \ldots, S_t\}$, and a non-negative weight function $w : \mathcal{S} \to \mathbb{N}$, we can compute $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ and $\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$ of size $\binom{p+q}{p}$ deterministically in time $O\left(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\right)$. The proof for this result is obtained by making the known exterior algebra based proof of Lovász [29, Theorem 4.8] algorithmic. Although our proof is based on exterior algebra and is essentially the same as the proof given in [22], we give a proof here which avoids the terminology from exterior algebra and argues everything using the properties of determinants, thus making the proof self-contained.

For our proof we also need the following well-known generalized Laplace expansion of determinants. For a matrix $A = (a_{ij})$, the row set and the column set are denoted by $\mathbf{R}(A)$ and $\mathbf{C}(A)$ respectively. For $I \subseteq \mathbf{R}(A)$ and $J \subseteq \mathbf{C}(A)$, $A[I, J] = (a_{ij} \mid i \in I, j \in J)$ means the submatrix (or minor) of $A$ with the row set $I$ and the column set $J$. For $I \subseteq [n]$ let $\bar{I} = [n] \setminus I$ and $\sum I = \sum_{i \in I} i$.

Proposition 3.1 (Generalized Laplace expansion). For an $n \times n$ matrix $A$ and $J \subseteq \mathbf{C}(A) = [n]$, it holds that

$$\det(A) = \sum_{I \subseteq [n],\, |I| = |J|} (-1)^{\sum I + \sum J} \det(A[I, J]) \det(A[\bar{I}, \bar{J}]).$$

We refer to [37, Proposition 2.1.3] for a proof of the above identity. We always assume that the number of rows in the representation matrix $A_M$ of $M$ over a field $\mathbb{F}$ is equal to $\mathrm{rank}(M) = \mathrm{rank}(A_M)$. Otherwise, using Gaussian elimination we can obtain a matrix of the desired kind in polynomial time. See [33, Proposition 3.1] for details. We do not give the proof for Theorem 1 but rather for the following generalization.
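The generalized Laplace expansion of Proposition 3.1 can be checked numerically on small matrices. The sketch below is an illustration only; the helper names `det`, `sub` and `laplace` are chosen here, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant by Gaussian elimination; the 0 x 0 determinant is 1."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return sign * result

def sub(A, rows, cols):
    """Submatrix A[rows, cols] (0-based indices, in the given order)."""
    return [[A[i][j] for j in cols] for i in rows]

def laplace(A, J):
    """Right-hand side of the expansion along the sorted column set J."""
    n = len(A)
    Jbar = [j for j in range(n) if j not in J]
    total = Fraction(0)
    for I in combinations(range(n), len(J)):
        Ibar = [i for i in range(n) if i not in I]
        # Index sums are taken 1-based, matching the statement of the proposition.
        sign = (-1) ** (sum(i + 1 for i in I) + sum(j + 1 for j in J))
        total += sign * det(sub(A, I, J)) * det(sub(A, Ibar, Jbar))
    return total

A = [[2, -1, 0, 3],
     [1, 4, 5, -2],
     [0, 7, 1, 1],
     [3, 2, -4, 6]]
```

For every column subset `J` of this matrix, `laplace(A, J)` should agree with `det(A)`; the two extreme cases (empty and full `J`) reduce to the determinant itself.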

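Theorem 3. Let $M = (E, \mathcal{I})$ be a linear matroid of rank $p + q = k$, $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a $p$-family of independent sets and $w : \mathcal{S} \to \mathbb{N}$ be a non-negative weight function. Then there exists $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ ($\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$) of size $\binom{p+q}{p}$. Moreover, given a representation $A_M$ of $M$ over a field $\mathbb{F}$, we can find $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ ($\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$) of size at most $\binom{p+q}{p}$ in $O\left(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\right)$ operations over $\mathbb{F}$.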

Proof. We only show how to find $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ in the claimed running time. The proof for $\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$ is analogous, and for that case we only point out the places where the proof differs. If $t \le \binom{k}{p}$, then we can take $\widehat{\mathcal{S}} = \mathcal{S}$. Clearly, in this case $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$. So from now onwards we always assume that $t > \binom{k}{p}$. For the proof we view the representation matrix $A_M$ as a vector space over $\mathbb{F}$ and each set $S_i \in \mathcal{S}$ as a subspace of this vector space. For every element $e \in E$, let $x_e$ be the corresponding $k$-dimensional column in $A_M$. Observe that each $x_e \in \mathbb{F}^k$.

For each subspace $S_i \in \mathcal{S}$, $i \in \{1, \ldots, t\}$, we associate a vector $\vec{s}_i = \bigwedge_{j \in S_i} x_j$ in $\mathbb{F}^{\binom{k}{p}}$ as follows. In exterior algebra terminology, the vector $\vec{s}_i$ is a wedge product of the vectors corresponding to elements in $S_i$. For a set $S \in \mathcal{S}$ and $I \in \binom{[k]}{p}$, we define $s[I] = \det(A_M[I, S])$. We also define

$$\vec{s}_i = (s_i[I])_{I \in \binom{[k]}{p}}.$$

Thus the entries of the vector $\vec{s}_i$ are the values of $\det(A_M[I, S_i])$, where $I$ runs through all the $p$-sized subsets of rows of $A_M$.
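Let $H_{\mathcal{S}} = (\vec{s}_1, \ldots, \vec{s}_t)$ be the $\binom{k}{p} \times t$ matrix obtained by taking the $\vec{s}_i$ as columns. Now we define a weight function $w' : \mathbf{C}(H_{\mathcal{S}}) \to \mathbb{R}^{+}$ on the set of columns of $H_{\mathcal{S}}$. For the column $\vec{s}_i$ corresponding to $S_i \in \mathcal{S}$, we define $w'(\vec{s}_i) = w(S_i)$. Let $\mathcal{W}$ be a set of columns of $H_{\mathcal{S}}$ that are linearly independent over $\mathbb{F}$, such that the size of $\mathcal{W}$ is equal to $\mathrm{rank}(H_{\mathcal{S}})$ and $\mathcal{W}$ is of minimum total weight with respect to the weight function $w'$. That is, $\mathcal{W}$ is a minimum weight column basis of $H_{\mathcal{S}}$. Since the row-rank of a matrix is equal to the column-rank, we have that $|\mathcal{W}| = \mathrm{rank}(H_{\mathcal{S}}) \le \binom{k}{p}$. We define $\widehat{\mathcal{S}} = \{S_\alpha \mid \vec{s}_\alpha \in \mathcal{W}\}$. Let $|\widehat{\mathcal{S}}| = \ell$. Because $|\mathcal{W}| = |\widehat{\mathcal{S}}|$, we have that $\ell \le \binom{k}{p}$. Without loss of generality, let $\widehat{\mathcal{S}} = \{S_i \mid 1 \le i \le \ell\}$ (else we can rename these sets) and $\mathcal{W} = \{\vec{s}_1, \ldots, \vec{s}_\ell\}$. The only thing that remains to show is that indeed $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$.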

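Let $S_\beta \in \mathcal{S}$ be such that $S_\beta \notin \widehat{\mathcal{S}}$. We show that if there is a set $Y \subseteq E$ of size at most $q$ such that $S_\beta \cap Y = \emptyset$ and $S_\beta \cup Y \in \mathcal{I}$, then there exists a set $\widehat{S}_\beta \in \widehat{\mathcal{S}}$ disjoint from $Y$ with $\widehat{S}_\beta \cup Y \in \mathcal{I}$ and $w(\widehat{S}_\beta) \le w(S_\beta)$. Let us first consider the case $|Y| = q$. Since $S_\beta \cap Y = \emptyset$, it follows that $|S_\beta \cup Y| = p + q = k$. Furthermore, since $S_\beta \cup Y \in \mathcal{I}$, we have that the columns corresponding to $S_\beta \cup Y$ in $A_M$ are linearly independent over $\mathbb{F}$; that is, $\det(A_M[\mathbf{R}(A_M), S_\beta \cup Y]) \neq 0$.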

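Recall that $\vec{s}_\beta = (s_\beta[I])_{I \in \binom{[k]}{p}}$, where $s_\beta[I] = \det(A_M[I, S_\beta])$. Similarly we define $y[L] = \det(A_M[L, Y])$ and $\vec{y} = (y[L])_{L \in \binom{[k]}{q}}$. Let $\sum J = \sum_{j \in S_\beta} j$. Define

$$\gamma(\vec{s}_\beta, \vec{y}) = \sum_{I \in \binom{[k]}{p}} (-1)^{\sum I + \sum J} s_\beta[I] \cdot y[\bar{I}].$$

Since $\binom{k}{p} = \binom{k}{k-p} = \binom{k}{q}$, the above formula is well defined. Observe that by Proposition 3.1 we have that $\gamma(\vec{s}_\beta, \vec{y}) = \det(A_M[\mathbf{R}(A_M), S_\beta \cup Y]) \neq 0$. We also know that $\vec{s}_\beta$ can be written as a linear combination of vectors in $\mathcal{W} = \{\vec{s}_1, \vec{s}_2, \ldots, \vec{s}_\ell\}$. That is, $\vec{s}_\beta = \sum_{i=1}^{\ell} \lambda_i \vec{s}_i$, $\lambda_i \in \mathbb{F}$, and for some $i$, $\lambda_i \neq 0$. Thus,

$$\gamma(\vec{s}_\beta, \vec{y}) = \sum_{I} (-1)^{\sum I + \sum J} s_\beta[I] \cdot y[\bar{I}]$$
$$= \sum_{I} (-1)^{\sum I + \sum J} \left(\sum_{i=1}^{\ell} \lambda_i s_i[I]\right) y[\bar{I}]$$
$$= \sum_{i=1}^{\ell} \lambda_i \left(\sum_{I} (-1)^{\sum I + \sum J} s_i[I] \cdot y[\bar{I}]\right)$$
$$= \sum_{i=1}^{\ell} \lambda_i \det(A_M[\mathbf{R}(A_M), S_i \cup Y]) \quad \text{(by Proposition 3.1)}.$$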

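Define

$$\mathrm{sup}(S_\beta) = \left\{ S_i \mid S_i \in \widehat{\mathcal{S}},\ \lambda_i \det(A_M[\mathbf{R}(A_M), S_i \cup Y]) \neq 0 \right\}.$$

Since $\gamma(\vec{s}_\beta, \vec{y}) \neq 0$, we have that $\sum_{i=1}^{\ell} \lambda_i \det(A_M[\mathbf{R}(A_M), S_i \cup Y]) \neq 0$ and thus $\mathrm{sup}(S_\beta) \neq \emptyset$. Observe that for all $S \in \mathrm{sup}(S_\beta)$ we have that $\det(A_M[\mathbf{R}(A_M), S \cup Y]) \neq 0$ and thus $S \cup Y \in \mathcal{I}$. We now show that $w(S) \le w(S_\beta)$ for all $S \in \mathrm{sup}(S_\beta)$.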

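Claim 3.1. For all $S \in \mathrm{sup}(S_\beta)$, $w(S) \le w(S_\beta)$.

Proof. For contradiction assume that there exists a set $S_j \in \mathrm{sup}(S_\beta)$ such that $w(S_j) > w(S_\beta)$. Let $\vec{s}_j$ be the vector corresponding to $S_j$ and $\mathcal{W}' = (\mathcal{W} \cup \{\vec{s}_\beta\}) \setminus \{\vec{s}_j\}$. Since $w(S_j) > w(S_\beta)$, we have that $w'(\vec{s}_j) > w'(\vec{s}_\beta)$ and thus $w'(\mathcal{W}) > w'(\mathcal{W}')$. Now we show that $\mathcal{W}'$ is also a column basis of $H_{\mathcal{S}}$. This will contradict our assumption that $\mathcal{W}$ is a minimum weight column basis of $H_{\mathcal{S}}$. Recall that $\vec{s}_\beta = \sum_{i=1}^{\ell} \lambda_i \vec{s}_i$, $\lambda_i \in \mathbb{F}$. Since $S_j \in \mathrm{sup}(S_\beta)$, we have that $\lambda_j \neq 0$. Thus $\vec{s}_j$ can be written as a linear combination of vectors in $\mathcal{W}'$. That is,

$$\vec{s}_j = \lambda_\beta \vec{s}_\beta + \sum_{i=1,\, i \neq j}^{\ell} \lambda'_i \vec{s}_i. \qquad (1)$$

Also every vector $\vec{s}_\gamma \notin \mathcal{W}$ can be written as a linear combination of vectors in $\mathcal{W}$:

$$\vec{s}_\gamma = \sum_{i=1}^{\ell} \delta_i \vec{s}_i, \quad \delta_i \in \mathbb{F}. \qquad (2)$$

By substituting (1) into (2), we conclude that every vector can be written as a linear combination of vectors in $\mathcal{W}'$. This shows that $\mathcal{W}'$ is also a column basis of $H_{\mathcal{S}}$, a contradiction proving the claim. □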



Claim 3.1 and the discussion preceding it show that we could take any set $S \in \mathrm{sup}(S_\beta)$ as the desired $\widehat{S}_\beta \in \widehat{\mathcal{S}}$. This shows that indeed $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ for each $Y$ of size $q$. This completes the proof for the case $|Y| = q$.

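Suppose that $|Y| = q' < q$. Since $M$ is a matroid of rank $k = p + q$, the independent set $S_\beta \cup Y$ can be extended to a basis, so there exists a superset $Y' \in \mathcal{I}$ of $Y$ of size $q$ such that $S_\beta \cap Y' = \emptyset$ and $S_\beta \cup Y' \in \mathcal{I}$. By the previous case, this implies that there exists a set $\widehat{S} \in \widehat{\mathcal{S}}$ such that $\det(A_M[\mathbf{R}(A_M), \widehat{S} \cup Y']) \neq 0$ and $w(\widehat{S}) \le w(S_\beta)$. Thus the columns corresponding to $\widehat{S} \cup Y$ are linearly independent.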

We consider the running time of the algorithm. To make the above proof algorithmic we need to (a) compute determinants and (b) apply fast Gaussian elimination to find a minimum weight column basis. It is well known that one can compute the determinant of an $n \times n$ matrix in time $O(n^{\omega})$ [9]. For a rectangular matrix $A$ of size $d \times n$ (with $d \le n$), Bodlaender et al. [7] outline an algorithm computing a minimum weight column basis in time $O(n d^{\omega-1})$. Thus given a $p$-family of independent sets $\mathcal{S}$ we can construct the matrix $H_{\mathcal{S}}$ as follows. For every set $S_i$, we first compute $\vec{s}_i$. To do this we compute $\det(A_M[I, S_i])$ for every $I \in \binom{[k]}{p}$. This can be done in time $O\left(\binom{p+q}{p} p^{\omega}\right)$. Thus, we can obtain the matrix $H_{\mathcal{S}}$ in time $O\left(\binom{p+q}{p} t p^{\omega}\right)$. Given the matrix $H_{\mathcal{S}}$ we can find a minimum weight column basis $\mathcal{W}$ of $H_{\mathcal{S}}$ in time $O\left(t \binom{p+q}{p}^{\omega-1}\right)$. Given $\mathcal{W}$, we can easily recover $\widehat{\mathcal{S}}$. Thus, we can compute $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ in $O\left(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\right)$ field operations. This concludes the proof for finding $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$. To find $\widehat{\mathcal{S}} \subseteq^{q}_{maxrep} \mathcal{S}$, the only change we need to make in the algorithm for finding $\widehat{\mathcal{S}} \subseteq^{q}_{minrep} \mathcal{S}$ is to find a maximum weight column basis $\mathcal{W}$ of $H_{\mathcal{S}}$. This concludes the proof. □
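The algorithm implicit in the proof of Theorem 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the fast procedure analysed above: it works over the rationals, computes each minor by plain Gaussian elimination, and replaces the fast minimum weight column basis routine of Bodlaender et al. by a simple greedy pass in order of increasing weight (which also yields a minimum weight basis, but more slowly). All function names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return sign * result

def minors_vector(A, S, p):
    """The vector (det A[I, S]) over all p-subsets I of the k rows of A."""
    cols = sorted(S)
    return [det([[A[i][j] for j in cols] for i in I])
            for I in combinations(range(len(A)), p)]

def min_weight_representative(A, family, weights, p):
    """Indices of a minimum weight column basis of the matrix of minor vectors."""
    vectors = [minors_vector(A, S, p) for S in family]
    order = sorted(range(len(family)), key=lambda i: weights[i])
    basis = []      # pairs (pivot position, reduced vector), in insertion order
    chosen = []
    for i in order:
        v = list(vectors[i])
        for piv, row in basis:          # eliminate against the current basis
            if v[piv] != 0:
                f = v[piv] / row[piv]
                v = [a - f * b for a, b in zip(v, row)]
        piv = next((c for c, x in enumerate(v) if x != 0), None)
        if piv is not None:             # independent of everything chosen so far
            basis.append((piv, v))
            chosen.append(i)
    return sorted(chosen)

# Toy rank-2 matroid with columns (1,0), (0,1), (1,1), (1,2); p = q = 1.
A = [[1, 0, 1, 1],
     [0, 1, 1, 2]]
family = [{0}, {1}, {2}, {3}]
weights = [5, 1, 2, 3]
chosen = min_weight_representative(A, family, weights, 1)
```

On the toy input the two cheapest independent columns are selected, so `chosen` is `[1, 2]`, a family of size $\binom{2}{1} = 2$. For the max-representative variant one would sort by decreasing weight instead.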

In Theorem 3 we assumed that $\mathrm{rank}(M) = p + q$. However, one can obtain a similar result even when $\mathrm{rank}(M) > p + q$, at the cost of using randomization. To do this we first need to compute the representation matrix of a $k$-restriction of $M = (E, \mathcal{I})$. For that we make use of Proposition 2.4. This step returns a representation of a $k$-restriction of $M = (E, \mathcal{I})$ with high probability. Given this matrix, we apply Theorem 3 and arrive at the following result.
Theorem 4. Let $M = (E, \mathcal{I})$ be a linear matroid and let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a $p$-family of independent sets. Then there exists $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$ of size $\binom{p+q}{p}$. Furthermore, given a representation $A_M$ of $M$ over a field $\mathbb{F}$, there is a randomized algorithm computing $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$ in $O\left(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\right)$ operations over $\mathbb{F}$.



4 Fast Computation for Representative Sets for Uniform Matroids

In this section we show that for uniform matroids one can avoid matrix multiplication computations in order to compute representative families. The section is organized as follows. We start (Section 4.1, Theorem 5) with a relatively simple algorithm computing representative families over a uniform matroid. This algorithm is already faster than the algorithm of Theorem 1 for general matroids. In Section 4.2, Theorem 2, we give an even faster, but more complicated algorithm. Throughout this section a subfamily $\mathcal{A}' \subseteq \mathcal{A}$ of the family $\mathcal{A}$ is said to $q$-represent $\mathcal{A}$ if for every set $B$ of size $q$ such that there is an $A \in \mathcal{A}$ with $A \cap B = \emptyset$, there is a set $A' \in \mathcal{A}'$ such that $A' \cap B = \emptyset$.

4.1 Representative Sets using Lopsided Universal Sets 

Our aim is to prove the following theorem. 

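Theorem 5. There is an algorithm that given a family $\mathcal{A}$ of sets of size $p$ over a universe $U$ of size $n$ and an integer $q$, computes in time $|\mathcal{A}| \cdot \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$ a subfamily $\mathcal{A}' \subseteq \mathcal{A}$ such that $|\mathcal{A}'| \le \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$ and $\mathcal{A}'$ $q$-represents $\mathcal{A}$.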

The main tool in our proof of Theorem 5 is a generalization of the notion of $n$-$k$-universal families. A family $\mathcal{F}$ of sets over a universe $U$ is an $n$-$k$-universal family if for every set $A \in \binom{U}{k}$ and every subset $A' \subseteq A$ there is some set $F \in \mathcal{F}$ whose intersection $F \cap A$ with $A$ is exactly equal to $A'$. Naor et al. [38] show that given $n$ and $k$ one can construct an $n$-$k$-universal family $\mathcal{F}$ of size $2^{k+o(k)} \cdot \log n$ in time $2^{k+o(k)} \cdot n \log n$.

We tweak the notion of universal families as follows. We will say that a family $\mathcal{F}$ of sets over a universe $U$ is an $n$-$p$-$q$-lopsided-universal family if for every $A \in \binom{U}{p}$ and $B \in \binom{U \setminus A}{q}$ there is an $F \in \mathcal{F}$ such that $A \subseteq F$ and $B \cap F = \emptyset$. An alternative definition that is easily seen to be equivalent is that $\mathcal{F}$ is $n$-$p$-$q$-lopsided-universal if for every subset $A \in \binom{U}{p+q}$ and every subset $A' \in \binom{A}{p}$, there is an $F \in \mathcal{F}$ such that $F \cap A = A'$. From the second definition it follows that an $n$-$(p+q)$-universal family is also $n$-$p$-$q$-lopsided-universal. Thus the construction of Naor et al. [38] of universal set families also gives a construction of an $n$-$p$-$q$-lopsided-universal family of size $2^{p+q+o(p+q)} \cdot \log n$, running in time $2^{p+q+o(p+q)} \cdot n \log n$. It turns out that by slightly changing the construction of Naor et al. [38], one can prove the following result.

Lemma 4.1. There is an algorithm that given $n$, $p$ and $q$ constructs an $n$-$p$-$q$-lopsided-universal family $\mathcal{F}$ of size $\binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$ in time $O\left(\binom{p+q}{p} \cdot 2^{o(p+q)} \cdot n \log n\right)$.



We do not give a stand-alone proof of Lemma 4.1; however, Lemma 4.1 is a direct corollary of Lemma 4.2, proved in Section 4.2. We will now show how to use the lemma to prove Theorem 5.

Proof of Theorem 5. The algorithm starts by constructing an $n$-$p$-$q$-lopsided-universal family $\mathcal{F}$ as guaranteed by Lemma 4.1. If $|\mathcal{A}| \le |\mathcal{F}|$ the algorithm outputs $\mathcal{A}$ and halts. Otherwise it builds the set $\mathcal{A}'$ as follows. We assume that $n \le |\mathcal{A}| \cdot p$, since otherwise some elements of the universe are not contained in any set in $\mathcal{A}$ and may safely be ignored. Initially $\mathcal{A}'$ is equal to $\emptyset$ and all sets in $\mathcal{F}$ are marked as unused. The algorithm goes through every $A \in \mathcal{A}$ and the unused sets $F \in \mathcal{F}$. If an unused set $F \in \mathcal{F}$ is found such that $A \subseteq F$, the algorithm marks $F$ as used, inserts $A$ into $\mathcal{A}'$ and proceeds to the next set in $\mathcal{A}$. If no such set $F$ is found the algorithm proceeds to the next set in $\mathcal{A}$ without inserting $A$ into $\mathcal{A}'$.

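The size of $\mathcal{A}'$ is upper bounded by $|\mathcal{F}| \le \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$, since every time a set is added to $\mathcal{A}'$ an unused set in $\mathcal{F}$ is marked as used. For the running time analysis, constructing $\mathcal{F}$ takes time $\binom{p+q}{p} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot n \log n$. Then we run through all of $\mathcal{F}$ for each set $A \in \mathcal{A}$, spending time $|\mathcal{A}| \cdot |\mathcal{F}| \cdot (p+q)^{O(1)}$, which is at most $|\mathcal{A}| \cdot \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$. Thus in total the running time is bounded by $|\mathcal{A}| \cdot \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$.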

Finally we need to argue that $\mathcal{A}'$ $q$-represents $\mathcal{A}$. Consider any set $A \in \mathcal{A}$ and $B$ such that $|B| = q$ and $A \cap B = \emptyset$. If $A \in \mathcal{A}'$ we are done, so assume that $A \notin \mathcal{A}'$. Since $\mathcal{F}$ is $n$-$p$-$q$-lopsided-universal there is a set $F \in \mathcal{F}$ such that $A \subseteq F$ and $F \cap B = \emptyset$. Since $A \notin \mathcal{A}'$ we know that $F$ was already marked as used when $A$ was considered by the algorithm. When the algorithm marked $F$ as used it also inserted a set $A'$ into $\mathcal{A}'$. For the insertion to be made, $F$ must satisfy $A' \subseteq F$. But then $A' \cap B = \emptyset$, completing the proof. □
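The marking procedure from the proof of Theorem 5 can be sketched in a few lines. The code below is an illustration under assumptions: the function names are invented here, and instead of the small family promised by Lemma 4.1 it uses a deliberately naive lopsided-universal family (the complements of all $q$-subsets of the universe), which is valid but much larger than necessary for large $n$.

```python
from itertools import combinations

def greedy_representative(family, universal):
    """Pick sets of `family`, each paired with a distinct unused F containing it."""
    family = [frozenset(A) for A in family]
    universal = [frozenset(F) for F in universal]
    if len(family) <= len(universal):
        return family                      # nothing to gain, output the input
    used = [False] * len(universal)
    chosen = []
    for A in family:
        for idx, F in enumerate(universal):
            if not used[idx] and A <= F:   # first unused F with A a subset of F
                used[idx] = True
                chosen.append(A)
                break
    return chosen

def q_represents(sub, family, universe, q):
    """Brute-force check of the definition of q-representation used in this section."""
    for B in map(frozenset, combinations(universe, q)):
        avoidable = any(not (frozenset(A) & B) for A in family)
        if avoidable and not any(not (frozenset(A) & B) for A in sub):
            return False
    return True

universe = range(7)
p, q = 2, 1
family = [frozenset(c) for c in combinations(universe, p)]
# Toy lopsided-universal family (an assumption of this sketch): for disjoint A, B
# with |B| = q, the set U \ B contains A and avoids B.
toy_universal = [frozenset(universe) - frozenset(B)
                 for B in combinations(universe, q)]
result = greedy_representative(family, toy_universal)
```

Here the 21 input pairs are reduced to at most 7 sets, one per member of the toy family, and the brute-force check confirms that the output still 1-represents the input.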

One of the factors that drive up the running time of the algorithm in Theorem 5 is that one needs to consider all of $\mathcal{F}$ for each set $A \in \mathcal{A}$. Doing some computations it is possible to convince oneself that in an $n$-$p$-$q$-lopsided-universal family $\mathcal{F}$ the number of sets $F \in \mathcal{F}$ that contain a fixed set $A$ of size $p$ should be approximately $|\mathcal{F}| \cdot \left(\frac{p}{p+q}\right)^{p}$. Thus, if we could make sure that this estimate is in fact correct for every $A \in \mathcal{A}$, and we could make sure that for a given $A \in \mathcal{A}$ we can list all of the sets in $\mathcal{F}$ that contain $A$ without having to go through the sets that do not, then we could speed up our algorithm by a factor $\left(\frac{p+q}{p}\right)^{p}$. This is exactly the strategy behind the main theorem of Section 4.2.



4.2 Representative Sets using Separating Collections 

The goal of this section is to prove the following theorem. 

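Theorem 2 (restated). There is an algorithm that given a family $\mathcal{A}$ of sets of size $p$ over a universe $U$ of size $n$ and an integer $q$, computes in time $O\left(|\mathcal{A}| \cdot \left(\frac{p+q}{q}\right)^{q} \cdot 2^{o(p+q)} \cdot \log n\right)$ a subfamily $\mathcal{A}' \subseteq \mathcal{A}$ such that $|\mathcal{A}'| \le \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$ and $\mathcal{A}'$ $q$-represents $\mathcal{A}$.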

We say that a family $\mathcal{F}$ separates a set $A$ from a set $B$ if there is an $F \in \mathcal{F}$ such that $A \subseteq F$ and $B \cap F = \emptyset$. The basic idea behind the proof of Theorem 2 is a construction of a small family $\mathcal{F}$ that separates every $A \in \mathcal{A}$ from every set $B$ of size $q$ such that $A \cap B = \emptyset$. We now define separating collections, which is basically a term that allows us to speak about how small the computed family $\mathcal{F}$ is, and how efficiently we can compute it.

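An $n$-$p$-$q$-separating collection $\mathcal{C}$ is a pair $(\mathcal{F}, \chi)$, where $\mathcal{F}$ is a family of sets over a universe $U$ of size $n$ and $\chi$ is a function from $\binom{U}{p}$ to $2^{\mathcal{F}}$ such that the following two properties are satisfied: (a) for every $A \in \binom{U}{p}$ and $F \in \chi(A)$, $A \subseteq F$; (b) for every $A \in \binom{U}{p}$ and $B \in \binom{U \setminus A}{q}$, $\chi(A)$ separates $A$ from $B$. The size of $(\mathcal{F}, \chi)$ is $|\mathcal{F}|$, whereas the max degree of $(\mathcal{F}, \chi)$ is $\max_{A \in \binom{U}{p}} |\chi(A)|$.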

A construction of separating collections is a data structure that, given $n$, $p$ and $q$, initializes and outputs a family $\mathcal{F}$ of sets over the universe $U$ of size $n$. After the initialization one can query the data structure by giving it a set $A \in \binom{U}{p}$; the data structure then outputs a family $\chi(A) \subseteq 2^{U}$. Together the pair $\mathcal{C} = (\mathcal{F}, \chi)$ computed by the data structure should form an $n$-$p$-$q$-separating collection.
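Properties (a) and (b) of a separating collection, together with its max degree, are easy to verify by exhaustive search on small universes. The following sketch is illustrative only: the dictionary representation of the query function and the toy collection (complements of single elements) are assumptions of this example, not constructions from the paper.

```python
from itertools import combinations

def is_separating_collection(universe, p, q, chi):
    """Check properties (a) and (b); `chi` maps each p-set to a list of sets."""
    universe = list(universe)
    for A in map(frozenset, combinations(universe, p)):
        candidates = chi.get(A, [])
        if any(not A <= F for F in candidates):            # property (a)
            return False
        rest = [u for u in universe if u not in A]
        for B in map(frozenset, combinations(rest, q)):    # property (b)
            if not any(not (F & B) for F in candidates):
                return False
    return True

def max_degree(chi):
    """Largest number of sets returned by any query."""
    return max(len(sets) for sets in chi.values())

universe = range(5)
p, q = 2, 1
# Toy family: all complements of single elements; a query returns the members
# of the family that contain the queried set.
toy_family = [frozenset(universe) - {b} for b in universe]
toy_chi = {A: [F for F in toy_family if A <= F]
           for A in map(frozenset, combinations(universe, p))}
```

For this toy collection the size is 5 and the max degree is 3, since exactly the complements of the three elements outside a queried pair contain that pair.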

We call the time the data structure takes to initialize and output $\mathcal{F}$ the initialization time. The query time of the data structure is the maximum time the data structure uses to compute $\chi(A)$ over all $A \in \binom{U}{p}$. The initialization time and query time of the data structure and the size and degree of $\mathcal{C}$ are functions of $n$, $p$ and $q$. The initialization time is denoted by $\tau_I(n, p, q)$, the query time by $\tau_Q(n, p, q)$, the resulting size of $\mathcal{C}$ is denoted by $\zeta(n, p, q)$, while the degree of $\mathcal{C}$ is denoted by $\Delta(n, p, q)$. The main technical component in the proof of Theorem 2 is the following lemma.



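Lemma 4.2. There is a construction of separating collections with the following parameters

• size $\zeta(n, p, q) \le \binom{p+q}{p} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot \log n$,

• initialization time $\tau_I(n, p, q) \le \binom{p+q}{p} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot n \log n$,

• degree $\Delta(n, p, q) \le \left(\frac{p+q}{q}\right)^{q} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot \log n$, and

• query time $\tau_Q(n, p, q) \le \left(\frac{p+q}{q}\right)^{q} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot \log n$.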

We will first show how Lemma 4.2 yields a proof of Theorem 2. The rest of the section contains a proof of Lemma 4.2.

Proof of Theorem 2. The algorithm starts by constructing an $n$-$p$-$q$-separating collection $(\mathcal{F}, \chi)$ as guaranteed by Lemma 4.2. If $|\mathcal{A}| \le |\mathcal{F}|$ the algorithm outputs $\mathcal{A}$ and halts. Otherwise it builds the set $\mathcal{A}'$ as follows. Initially $\mathcal{A}'$ is equal to $\emptyset$ and all sets in $\mathcal{F}$ are marked as unused. The algorithm goes through every $A \in \mathcal{A}$ and queries the separating collection to get the set $\chi(A)$. It then looks for a set $F \in \chi(A)$ that is not yet marked as used. The first time such a set $F$ is found the algorithm marks $F$ as used, inserts $A$ into $\mathcal{A}'$ and proceeds to the next set in $\mathcal{A}$. If no such set $F$ is found the algorithm proceeds to the next set in $\mathcal{A}$ without inserting $A$ into $\mathcal{A}'$.

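The size of $\mathcal{A}'$ is upper bounded by $|\mathcal{F}| \le \binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$, since every time a set is added to $\mathcal{A}'$ an unused set in $\mathcal{F}$ is marked as used. For the running time analysis, the initialization of $(\mathcal{F}, \chi)$ takes time $\binom{p+q}{p} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot n \log n$. For each element $A \in \mathcal{A}$ the algorithm first queries $\chi(A)$, using time $\left(\frac{p+q}{q}\right)^{q} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot \log n$. Then it goes through all sets in $\chi(A)$ and checks whether they have already been marked as used, taking time $\left(\frac{p+q}{q}\right)^{q} \cdot 2^{O\left(\frac{p+q}{\log\log(p+q)}\right)} \cdot \log n$. Thus in total, the running time is bounded by $O\left(|\mathcal{A}| \cdot \left(\frac{p+q}{q}\right)^{q} \cdot 2^{o(p+q)} \cdot \log n\right)$ as claimed.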

Finally we need to argue that $\mathcal{A}'$ $q$-represents $\mathcal{A}$. Consider any set $A \in \mathcal{A}$ and $B$ such that $|B| = q$ and $A \cap B = \emptyset$. If $A \in \mathcal{A}'$ we are done, so assume that $A \notin \mathcal{A}'$. Since $\chi(A)$ separates $A$ from $B$ there is a set $F \in \chi(A)$ such that $A \subseteq F$ and $F \cap B = \emptyset$. Since $A \notin \mathcal{A}'$ we know that $F$ was marked as used when $A$ was considered by the algorithm. When the algorithm marked $F$ as used it also inserted a set $A'$ into $\mathcal{A}'$, with the property that $F \in \chi(A')$. Thus $A' \subseteq F$ and hence $A' \cap B = \emptyset$. But $A' \in \mathcal{A}'$, completing the proof. □



We now turn to the proof of Lemma 4.2. The proof is based on the splitters technique of Naor et al. [38]. We start by giving a construction of separating collections with good bounds on the size and the degree, quite reasonable query time but with really slow initialization time.

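Lemma 4.3. There is a construction of separating collections with

• size $\zeta(n, p, q) = O\left(\binom{p+q}{p} \cdot (p+q)^{O(1)} \cdot \log n\right)$,

• initialization time $\tau_I(n, p, q) = O\left(\binom{2^{n}}{\zeta(n, p, q)} \cdot n^{O(p+q)}\right)$,

• degree $\Delta(n, p, q) = O\left(\left(\frac{p+q}{q}\right)^{q} \cdot (p+q)^{O(1)} \cdot \log n\right)$, and

• query time $\tau_Q(n, p, q) = O\left(\binom{p+q}{p} \cdot n^{O(1)}\right)$.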




Proof. We start by giving a randomized algorithm that with positive probability constructs an $n$-$p$-$q$-separating collection $\mathcal{C} = (\mathcal{F}, \chi)$ with the desired size and degree parameters. We will then discuss how to deterministically compute such a $\mathcal{C}$ within the required time bound. Set $t = \frac{(p+q)^{p+q}}{p^{p} q^{q}} \cdot (p+q+1) \log n$ and construct the family $\mathcal{F} = \{F_1, \ldots, F_t\}$ as follows. Each set $F_i$ is a random subset of $U$, where each element of $U$ is inserted into $F_i$ with probability $\frac{p}{p+q}$. Distinct elements are inserted (or not) into $F_i$ independently, and the construction of the different sets in $\mathcal{F}$ is also independent. For each $A \in \binom{U}{p}$ we set $\chi(A) = \{F \in \mathcal{F} : A \subseteq F\}$.

The size of $\mathcal{F}$ is within the required bounds by construction, as $\frac{(p+q)^{p+q}}{p^{p} q^{q}} \cdot (p+q+1) \log n \le \binom{p+q}{p} \cdot (p+q)^{2} \cdot \log n$. We now argue that with positive probability $(\mathcal{F}, \chi)$ is indeed an $n$-$p$-$q$-separating collection, and that the degree of $\mathcal{C}$ is within the required bounds as well. For a fixed set $A \in \binom{U}{p}$, set $B \in \binom{U \setminus A}{q}$, and integer $i \le t$, we consider the probability that $A \subseteq F_i$ and $B \cap F_i = \emptyset$. This probability is

$$\left(\frac{p}{p+q}\right)^{p} \cdot \left(\frac{q}{p+q}\right)^{q} = \frac{p^{p} q^{q}}{(p+q)^{p+q}}.$$

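Since each $F_i$ is constructed independently from the other sets in $\mathcal{F}$, the probability that no $F_i$ satisfies $A \subseteq F_i$ and $B \cap F_i = \emptyset$ is

$$\left(1 - \frac{p^{p} q^{q}}{(p+q)^{p+q}}\right)^{t} \le e^{-(p+q+1) \log n} = \frac{1}{n^{p+q+1}}.$$

There are $\binom{n}{p}$ choices for $A \in \binom{U}{p}$ and $\binom{n-p}{q}$ choices for $B \in \binom{U \setminus A}{q}$, therefore the union bound yields that the probability that there exist an $A \in \binom{U}{p}$ and a $B \in \binom{U \setminus A}{q}$ such that $\mathcal{F}$ does not separate $A$ from $B$ is at most $\frac{1}{n^{p+q+1}} \cdot n^{p+q} = \frac{1}{n}$. Since $\chi(A)$ contains all the sets in $\mathcal{F}$ that contain $A$, $\chi(A)$ separates $A$ from $B$ whenever $\mathcal{F}$ does.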





We also need to upper bound the max degree of $\mathcal{C}$. For every $A \in \binom{U}{p}$, $|\chi(A)|$ is a random variable. For a fixed $A \in \binom{U}{p}$ and $i \le t$ the probability that $A \subseteq F_i$ is exactly $\left(\frac{p}{p+q}\right)^{p}$. Hence $|\chi(A)|$ is the sum of $t$ independent 0/1 variables that each take value 1 with probability $\left(\frac{p}{p+q}\right)^{p}$. Hence the expected value of $|\chi(A)|$ is

$$E[|\chi(A)|] = t \cdot \left(\frac{p}{p+q}\right)^{p} = \left(\frac{p+q}{q}\right)^{q} \cdot (p+q+1) \log n.$$

Standard Chernoff bounds [34, Theorem 4.4] show that the probability that $|\chi(A)|$ is at least $6E[|\chi(A)|]$ is upper bounded by $2^{-6E[|\chi(A)|]} \le \frac{1}{n^{p+q+1}}$. There are at most $\binom{n}{p}$ choices for $A \in \binom{U}{p}$. Hence the union bound yields that the probability that there exists an $A \in \binom{U}{p}$ such that $|\chi(A)| > 6E[|\chi(A)|]$ is upper bounded by $\frac{1}{n}$. Thus $\mathcal{C}$ is an $n$-$p$-$q$-separating collection with the desired size and degree parameters with probability at least $1 - \frac{2}{n} > 0$. The degenerate case that $1 - \frac{2}{n} \le 0$ is handled by the family $\mathcal{F}$ containing all (at most four) subsets of $U$.

To construct $\mathcal{F}$ within the stated initialization time bound, it is sufficient to try all families $\mathcal{F}$ of size $t$ and for each of the $\binom{2^{n}}{t}$ guesses test whether it is indeed an $n$-$p$-$q$-separating collection in time $O(t \cdot n^{O(p+q)}) = O(n^{O(p+q)})$.

For the queries, we need to give an algorithm that, given $A$, computes $\chi(A)$, under the assumption that $\mathcal{F}$ has already been computed in the initialization step. This is easily done within the stated running time bound by going through every set $F \in \mathcal{F}$, checking whether $A \subseteq F$, and if so, inserting $F$ into $\chi(A)$. This concludes the proof. □
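The randomized part of the proof of Lemma 4.3 can be sketched as follows. This is an illustrative sketch, not the deterministic procedure of the lemma: the function names, the use of the natural logarithm, the rounding of $t$ up to an integer, and the retry loop over seeds (in place of the exhaustive search over all families) are assumptions made here.

```python
import math
import random
from itertools import combinations

def random_family(n, p, q, rng):
    """Sample t random subsets of {0, ..., n-1}, each element kept with prob. p/(p+q)."""
    t = math.ceil((p + q) ** (p + q) / (p ** p * q ** q) * (p + q + 1) * math.log(n))
    prob = p / (p + q)
    return [frozenset(u for u in range(n) if rng.random() < prob) for _ in range(t)]

def chi(family, A):
    """Query: all sets of the family that contain A."""
    return [F for F in family if A <= F]

def separates_everything(family, n, p, q):
    """Brute-force check that chi(A) separates A from every disjoint q-set B."""
    for A in map(frozenset, combinations(range(n), p)):
        candidates = chi(family, A)
        rest = [u for u in range(n) if u not in A]
        for B in map(frozenset, combinations(rest, q)):
            if not any(not (F & B) for F in candidates):
                return False
    return True

def build(n, p, q, seed=0, max_tries=50):
    """Resample with fresh seeds until the sampled family passes the check."""
    for attempt in range(max_tries):
        family = random_family(n, p, q, random.Random(seed + attempt))
        if separates_everything(family, n, p, q):
            return family
    return None

family = build(6, 2, 2)
```

For $n = 6$ and $p = q = 2$ the formula gives $t = \lceil 16 \cdot 5 \cdot \ln 6 \rceil = 144$ sampled sets; since the failure probability of a single sample is small, the retry loop is expected to stop after very few attempts.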



We will now work towards improving the time bounds of Lemma 4.3. To that end we will need a construction of $k$-perfect hash functions by Alon et al. [1]. A family of functions $f_1, \ldots, f_t$ from a universe $U$ of size $n$ to a universe of size $r$ is a $k$-perfect family of hash functions if for every set $S \subseteq U$ such that $|S| = k$ there exists an $i$ such that the restriction of $f_i$ to $S$ is injective. Alon et al. [1] give very efficient constructions of $k$-perfect families of hash functions from a universe of size $n$ to a universe of size $k^2$.

Proposition 4.1 ([1]). For any universe $U$ of size $n$ there is a $k$-perfect family $f_1, \ldots, f_t$ of hash functions from $U$ to $\{1, 2, \ldots, k^2\}$ with $t = O(k^{O(1)} \cdot \log n)$. Such a family of hash functions can be constructed in time $O(k^{O(1)} n \log n)$.
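The defining property of a $k$-perfect family is straightforward to test exhaustively on a small universe. The checker below is a sketch for illustration; the two hash functions in the example are hand-picked toys, not the construction of Alon et al., and they map into $\{0, \ldots, k^2 - 1\}$ rather than $\{1, \ldots, k^2\}$.

```python
from itertools import combinations

def is_k_perfect(functions, universe, k):
    """True if every k-subset of the universe is mapped injectively by some function."""
    return all(
        any(len({f(x) for x in S}) == k for f in functions)
        for S in combinations(universe, k)
    )

# Toy example with k = 2 on the universe {0, ..., 5} and range of size k^2 = 4.
f1 = lambda x: x % 4      # collides on the pairs (0, 4) and (1, 5)
f2 = lambda x: x // 2     # separates both of those pairs
```

Neither function is injective on every pair by itself (for instance `f2` collides on 0 and 1), but together they form a 2-perfect family for this universe.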



We now use Proposition 4.1 to give a universe reduction lemma that allows us to reduce the construction of separating collections to constructions of separating collections over a small universe.

Lemma 4.4. If there is a construction of $n$-$p$-$q$-separating collections with initialization time $\tau_I(n, p, q)$, query time $\tau_Q(n, p, q)$, producing an $n$-$p$-$q$-separating collection with size $\zeta(n, p, q)$ and degree $\Delta(n, p, q)$, then there is a construction of $n$-$p$-$q$-separating collections using

• size $\zeta'(n, p, q) \le \zeta((p+q)^2, p, q) \cdot (p+q)^{O(1)} \cdot \log n$,

• initialization time $\tau'_I(n, p, q) \le O\left(\tau_I((p+q)^2, p, q) + \zeta((p+q)^2, p, q) \cdot (p+q)^{O(1)} \cdot n \log n\right)$,

• degree $\Delta'(n, p, q) \le \Delta((p+q)^2, p, q) \cdot (p+q)^{O(1)} \cdot \log n$, and

• query time $\tau'_Q(n, p, q) \le O\left(\left(\tau_Q((p+q)^2, p, q) + \Delta((p+q)^2, p, q)\right) \cdot (p+q)^{O(1)} \cdot \log n\right)$.

Proof. We give a construction of $n$-$p$-$q$-separating collections with initialization time, query time, size and degree $\tau'_I$, $\tau'_Q$, $\zeta'$ and $\Delta'$ respectively, using the construction with initialization time, query time, size and degree $\tau_I$, $\tau_Q$, $\zeta$ and $\Delta$ as a black box.

We first describe the initialization of the data structure. Given $n$, $p$ and $q$, we construct using Proposition 4.1 a $(p+q)$-perfect family $f_1, \ldots, f_t$ of hash functions from the universe $U$ to $\{1, 2, \ldots, (p+q)^2\}$. The construction takes time $O((p+q)^{O(1)} n \log n)$ and $t \le (p+q)^{O(1)} \cdot \log n$. We will store these hash functions in memory.

For a set $S \subseteq U$, by $f_i(S)$ we will mean $\{f_i(s) : s \in S\}$. Similarly for every $S \subseteq \{1, \ldots, (p+q)^2\}$, by $f_i^{-1}(S)$ we will mean $\{s \in U : f_i(s) \in S\}$. For a family $\mathcal{Z}$ of sets over $U$, by $f_i(\mathcal{Z})$ we will mean $\{f_i(S) : S \in \mathcal{Z}\}$. Finally, for a family $\mathcal{Z}$ of sets over $\{1, \ldots, (p+q)^2\}$, by $f_i^{-1}(\mathcal{Z})$ we will mean $\{f_i^{-1}(S) : S \in \mathcal{Z}\}$.

We first use the given black box construction for $(p+q)^2$-$p$-$q$-separating collections over the universe $\{1, \ldots, (p+q)^2\}$. This construction computes the separating collection $(\widehat{\mathcal{F}}, \widehat{\chi})$. We run the initialization algorithm of this construction and store the family $\widehat{\mathcal{F}}$ in memory. We then set

$$\mathcal{F} = \bigcup_{i \le t} f_i^{-1}(\widehat{\mathcal{F}}).$$

We spent $O((p+q)^{O(1)} n \log n)$ time to construct a $(p+q)$-perfect family of hash functions, $O(\tau_I((p+q)^2, p, q))$ time to construct $\widehat{\mathcal{F}}$ of size $\zeta((p+q)^2, p, q)$, and $O(\zeta((p+q)^2, p, q) \cdot (p+q)^{O(1)} \cdot n \log n)$ time to construct $\mathcal{F}$ from $\widehat{\mathcal{F}}$ and the family of perfect hash functions. Thus the upper bound on $\tau'_I(n, p, q)$ follows. Furthermore, $|\mathcal{F}| \le |\widehat{\mathcal{F}}| \cdot (p+q)^{O(1)} \cdot \log n$, yielding the claimed bound for $\zeta'$.

We now define $\chi(A)$ for every $A \in \binom{U}{p}$ and describe the query algorithm. For every $A \in \binom{U}{p}$ we let

$$\chi(A) = \bigcup_{i \le t,\ |f_i(A)| = |A|} f_i^{-1}\left(\widehat{\chi}(f_i(A))\right).$$

Since $f_i(A) \subseteq \widehat{F}$ for every $\widehat{F} \in \widehat{\chi}(f_i(A))$ it follows that $A \subseteq F$ for every $F \in \chi(A)$. Furthermore we can bound $|\chi(A)|$ as follows:

$$|\chi(A)| \le \sum_{i \le t,\ |f_i(A)| = |A|} |\widehat{\chi}(f_i(A))| \le \Delta((p+q)^2, p, q) \cdot (p+q)^{O(1)} \cdot \log n.$$

Thus the claimed bound for $\Delta'$ follows. To compute $\chi(A)$ we go over every $i \le t$ and check whether $f_i$ is injective on $A$. This takes time $O((p+q)^{O(1)} \cdot \log n)$. For each $i$ such that $f_i$ is injective on $A$, we compute $f_i(A)$ and then $\widehat{\chi}(f_i(A))$ in time $O(\tau_Q((p+q)^2, p, q) + p + q)$. Then we compute $f_i^{-1}(\widehat{\chi}(f_i(A)))$ in time $O(|\widehat{\chi}(f_i(A))| \cdot (p+q)^{O(1)}) = O(\Delta((p+q)^2, p, q) \cdot (p+q)^{O(1)})$ and add this set to $\chi(A)$. As we need to do this $O((p+q)^{O(1)} \cdot \log n)$ times, the total time to compute $\chi(A)$ is upper bounded by $O\left(\left(\tau_Q((p+q)^2, p, q) + \Delta((p+q)^2, p, q)\right) \cdot (p+q)^{O(1)} \cdot \log n\right)$, yielding the claimed upper bound on $\tau'_Q$.

It remains to argue that $(\mathcal{F}, \chi)$ is in fact an $n$-$p$-$q$-separating collection. Consider a set $A \in \binom{U}{p}$ and $B \in \binom{U \setminus A}{q}$. We need to show that $\chi(A)$ separates $A$ from $B$. Since $f_1, \ldots, f_t$ is a $(p+q)$-perfect family of hash functions, there is an $i$ such that $f_i$ is injective on $A \cup B$. Since $(\widehat{\mathcal{F}}, \widehat{\chi})$ is a $(p+q)^2$-$p$-$q$-separating collection, we have that $\widehat{\chi}(f_i(A))$ separates $f_i(A)$ from $f_i(B)$. Thus $f_i^{-1}(\widehat{\chi}(f_i(A)))$ separates $A$ from $B$. Since $f_i$ is injective on $A$ it follows that $f_i^{-1}(\widehat{\chi}(f_i(A))) \subseteq \chi(A)$, and hence $\chi(A)$ separates $A$ from $B$, concluding the proof. □

We now give a splitting lemma, which allows us to reduce the problem of finding $n$-$p$-$q$-separating collections to the same problem, but with much smaller values for $p$ and $q$. To that end we need some definitions.

A partition of $U$ is a family $\mathcal{U}_P = \{U_1, U_2, \ldots, U_t\}$ of sets over $U$ such that $U_i \cap U_j = \emptyset$ for every $i \neq j$ and $U = \bigcup_{i \le t} U_i$. Each of the sets $U_i$ is called a part of the partition. A consecutive partition of $\{1, \ldots, n\}$ is a partition $\mathcal{U}_P = \{U_1, U_2, \ldots, U_t\}$ of $\{1, \ldots, n\}$ such that for every integer $i \le t$ and integers $1 \le x \le y \le z$, if $x \in U_i$ and $z \in U_i$ then $y \in U_i$ as well. In other words, in a consecutive partition each part is a consecutive interval of integers. For every integer $t$, let $\mathcal{P}_t$ denote the collection of all consecutive partitions of $\{1, \ldots, n\}$ with exactly $t$ parts. We do not demand that all of the parts in a partition in $\mathcal{P}_t$ are non-empty. Simple counting arguments show that for every $t$, $|\mathcal{P}_t| = \binom{n+t-1}{t-1}$.

We will denote by $\mathcal{Z}^{p}_{s,t}$ the set of all $t$-tuples $(p_1, p_2, \ldots, p_t)$ of integers such that $\sum_{i \le t} p_i = p$ and $0 \le p_i \le s$ for all $i$. Clearly $|\mathcal{Z}^{p}_{s,t}| \le \binom{p+t-1}{t-1}$, since this counts all the ways of writing $p$ as a sum of $t$ non-negative integers, without considering the upper bound on each one.
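The two counting claims above can be confirmed by direct enumeration for small parameters. In the sketch below, which is only an illustration, a consecutive partition is represented as a tuple of $t$ intervals listed from left to right (an assumption about how the parts are ordered), and the function names are chosen here.

```python
from itertools import combinations_with_replacement, product
from math import comb

def consecutive_partitions(n, t):
    """Yield every way to cut 1..n into t consecutive (possibly empty) intervals."""
    for cuts in combinations_with_replacement(range(n + 1), t - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(tuple(range(bounds[i] + 1, bounds[i + 1] + 1)) for i in range(t))

def bounded_compositions(p, s, t):
    """All t-tuples of integers in [0, s] that sum to p."""
    return [c for c in product(range(s + 1), repeat=t) if sum(c) == p]

parts = list(consecutive_partitions(5, 3))
tuples = bounded_compositions(4, 2, 3)
```

For $n = 5$ and $t = 3$ the enumeration produces $\binom{7}{2} = 21$ partitions, matching the formula. For $p = 4$, $s = 2$, $t = 3$ there are 6 tuples, comfortably below the bound $\binom{6}{2} = 15$, which ignores the cap $s$ on each entry.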

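Lemma 4.5. For any $p$, $q$ let $s = \lfloor (\log(p+q))^2 \rfloor$ and $t = \left\lceil \frac{p+q}{s} \right\rceil$. If there is a construction of $n$-$p$-$q$-separating collections with initialization time $\tau_I(n, p, q)$, query time $\tau_Q(n, p, q)$, producing an $n$-$p$-$q$-separating collection with size $\zeta(n, p, q)$ and degree $\Delta(n, p, q)$, then there is a construction of $n$-$p$-$q$-separating collections using

• size $\zeta'(n, p, q) \le |\mathcal{P}_t| \cdot \sum_{(p_1, \ldots, p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \zeta(n, p_i, s - p_i)$,

• initialization time $\tau'_I(n, p, q) \le O\left(\left(\sum_{\hat{p} \le s} \tau_I(n, \hat{p}, s - \hat{p})\right) + \zeta'(n, p, q) \cdot n^{O(1)}\right)$,

• degree $\Delta'(n, p, q) \le |\mathcal{P}_t| \cdot \max_{(p_1, \ldots, p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \Delta(n, p_i, s - p_i)$, and

• query time $\tau'_Q(n, p, q) \le O\left(\Delta'(n, p, q) \cdot n^{O(1)} + |\mathcal{P}_t| \cdot t \cdot \sum_{\hat{p} \le s} \tau_Q(n, \hat{p}, s - \hat{p})\right)$.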

Proof. Set $s = \lfloor (\log(p+q))^2 \rfloor$, $t = \left\lceil \frac{p+q}{s} \right\rceil$ and $\hat{q} = st - p$. We will give a construction of $n$-$p$-$\hat{q}$-separating collections with initialization time, query time, size and degree within the claimed bounds for $\tau'_I(n, p, q)$, $\tau'_Q(n, p, q)$, $\zeta'(n, p, q)$ and $\Delta'(n, p, q)$ respectively. In the construction we will be using the construction with initialization time, query time, size and degree $\tau_I$, $\tau_Q$, $\zeta$ and $\Delta$ as a black box. Since $\hat{q} \ge q$, an $n$-$p$-$\hat{q}$-separating collection is also an $n$-$p$-$q$-separating collection. We may assume without loss of generality that $U = \{1, \ldots, n\}$.

Our algorithm runs, for every $\hat{p}$ between $0$ and $s$, the initialization of the given construction of $n$-$\hat{p}$-$(s - \hat{p})$-separating collections. We will refer by $(\mathcal{F}_{\hat{p}}, \chi_{\hat{p}})$ to the separating collection constructed for $\hat{p}$. For each $\hat{p}$ the initialization of the construction outputs the family $\mathcal{F}_{\hat{p}}$.

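We need to define a few operations on families of sets. For a family $\mathcal{A}$ over $U$ and subset $U' \subseteq U$ we define

$$\mathcal{A} \sqcap U' = \{A \cap U' : A \in \mathcal{A}\}.$$

For two families $\mathcal{A}$ and $\mathcal{B}$ over (subsets of) $U$, we define

$$\mathcal{A} \circ \mathcal{B} = \{A \cup B : A \in \mathcal{A} \wedge B \in \mathcal{B}\}.$$

We now define $\mathcal{F}$ as follows.

$$\mathcal{F} = \bigcup_{\substack{\{U_1, \ldots, U_t\} \in \mathcal{P}_t \\ (p_1, \ldots, p_t) \in \mathcal{Z}^{p}_{s,t}}} (\mathcal{F}_{p_1} \sqcap U_1) \circ (\mathcal{F}_{p_2} \sqcap U_2) \circ \cdots \circ (\mathcal{F}_{p_t} \sqcap U_t) \qquad (3)$$

It follows directly from the definition of $\mathcal{F}$ that $|\mathcal{F}|$ is within the claimed bound for $\zeta'(n, p, q)$. For the initialization time, the algorithm spends $O\left(\sum_{\hat{p} \le s} \tau_I(n, \hat{p}, s - \hat{p})\right)$ time to initialize the constructions of the $n$-$\hat{p}$-$(s - \hat{p})$-separating collections. At that point the algorithm can output the entries of $\mathcal{F}$ one set at a time by using (3), spending $n^{O(1)}$ time per output set.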
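The two family operations used to assemble the family in (3), restriction of a family to a part and pairwise union of two families, are simple to express directly. The sketch below is only an illustration with names chosen here; `combine` builds the contribution of a single partition and a single choice of component families, not the full union in (3).

```python
def restrict(family, part):
    """Restriction of a family to a part: intersect every member with the part."""
    part = frozenset(part)
    return {frozenset(A) & part for A in family}

def compose(fam_a, fam_b):
    """Composition of two families: all unions of one member from each."""
    return {frozenset(a) | frozenset(b) for a in fam_a for b in fam_b}

def combine(families, parts):
    """Restrict the i-th family to the i-th part and compose the results in order."""
    result = {frozenset()}
    for fam, part in zip(families, parts):
        result = compose(result, restrict(fam, part))
    return result

F1 = [{1, 2}, {2, 3, 4}]
F2 = [{1, 4}, {3}]
combined = combine([F1, F2], [{1, 2}, {3, 4}])
```

With the parts {1, 2} and {3, 4}, the first family restricts to {1, 2} and {2}, the second to {4} and {3}, so `combined` contains the four sets {1, 2, 4}, {1, 2, 3}, {2, 4} and {2, 3}.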
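For every set $A \in \binom{U}{p}$ we define $\chi(A)$ as follows.

$$\chi(A) = \bigcup_{\substack{\{U_1, \ldots, U_t\} \in \mathcal{P}_t \text{ s.t.} \\ \forall U_i :\ p_i = |U_i \cap A| \le s}} \left(\chi_{p_1}(A \cap U_1) \sqcap U_1\right) \circ \left(\chi_{p_2}(A \cap U_2) \sqcap U_2\right) \circ \cdots \circ \left(\chi_{p_t}(A \cap U_t) \sqcap U_t\right) \qquad (4)$$

It follows directly from the definition of $\chi(A)$ that $|\chi(A)|$ is within the claimed bound for $\Delta'(n, p, q)$. We now describe how queries $\chi(A)$ can be answered, and analyze how much time it takes. Given $A$ we will compute $\chi(A)$ using (4). For each $\{U_1, \ldots, U_t\} \in \mathcal{P}_t$ such that $p_i = |U_i \cap A| \le s$ for all $i \le t$, we proceed as follows. First we compute $\chi_{p_i}(A \cap U_i)$ for each $i \le t$, spending in total $O\left(\sum_{i \le t} \tau_Q(n, p_i, s - p_i)\right)$ time. Now we add $\left(\chi_{p_1}(A \cap U_1) \sqcap U_1\right) \circ \left(\chi_{p_2}(A \cap U_2) \sqcap U_2\right) \circ \cdots \circ \left(\chi_{p_t}(A \cap U_t) \sqcap U_t\right)$ to $\chi(A)$, spending $n^{O(1)}$ time per set that is added to $\chi(A)$, yielding the bound below for $\tau'_Q(n, p, q)$.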

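$$\tau'_Q(n, p, q) \le O\left(\Delta'(n, p, q) \cdot n^{O(1)} + \sum_{\substack{\{U_1, \ldots, U_t\} \in \mathcal{P}_t \text{ s.t.} \\ \forall U_i :\ p_i = |U_i \cap A| \le s}} \left[\sum_{i \le t} \tau_Q(n, p_i, s - p_i)\right]\right)$$
$$\le O\left(\Delta'(n, p, q) \cdot n^{O(1)} + |\mathcal{P}_t| \cdot \max_{(p_1, \ldots, p_t) \in \mathcal{Z}^{p}_{s,t}} \left(\sum_{i \le t} \tau_Q(n, p_i, s - p_i)\right)\right)$$
$$\le O\left(\Delta'(n, p, q) \cdot n^{O(1)} + |\mathcal{P}_t| \cdot t \cdot \sum_{\hat{p} \le s} \tau_Q(n, \hat{p}, s - \hat{p})\right)$$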

We now need to argue that $(\mathcal{F}, \chi)$ is in fact an $n$-$p$-$\hat{q}$-separating collection. Consider two sets $A \in \binom{U}{p}$ and $B \in \binom{U \setminus A}{\hat{q}}$. There exists a consecutive partition $\{U_1, \ldots, U_t\} \in \mathcal{P}_t$ of $U$ such that for every $i \le t$ we have that $|(A \cup B) \cap U_i| = \frac{p + \hat{q}}{t} = s$. For each $i \le t$ set $p_i = |A \cap U_i|$ and $q_i = |B \cap U_i| = s - p_i$. For every $i \le t$ the pair $(\mathcal{F}_{p_i}, \chi_{p_i})$ forms an $n$-$p_i$-$q_i$-separating collection. Hence $\chi_{p_i}(A \cap U_i)$ separates $A \cap U_i$ from $B \cap U_i$. Let $F_i \in \chi_{p_i}(A \cap U_i)$ be such that $A \cap U_i \subseteq F_i$ and $F_i \cap B \cap U_i = \emptyset$. Let $F = \bigcup_{i \le t} (F_i \cap U_i)$. Then $A \subseteq F$ and $B \cap F = \emptyset$. The construction of $\chi(A)$ ensures that $F \in \chi(A)$ and thus $\chi(A)$ separates $A$ from $B$, completing the proof. □

We are now ready to prove Lemma 4.2.

Proof of Lemma 4.2. The structure of the proof is simple; first we create a construction using Lemma 4.3. Applying Lemma 4.4 gives a new, second, construction. We then make more constructions by repeatedly applying Lemmata 4.5 and 4.4. Specifically, the third construction is obtained by applying Lemma 4.5 to the second, the fourth by applying Lemma 4.4 to the third, the fifth by applying Lemma 4.5 to the fourth, and the sixth and final construction is obtained by applying Lemma 4.4 to the fifth. The bulk of the work is to verify that the respective constructions indeed have the claimed parameters. We now proceed with the formal proof.



Apply Lemma 4.3 and get a construction of $n$-$p$-$q$-separating collections with the following parameters.

• size $\zeta^1(n,p,q) \le O\big(\binom{p+q}{p} \cdot (p+q)^{O(1)} \cdot \log n\big)$,

. initialization time TJ{n,p,q) < 0{{^^^"p^^) ■ n'^(P+i^) < 2"(p+9)'', 

• degree $\Delta^1(n,p,q) \le O\big((\frac{p+q}{q})^{q} \cdot (p+q)^{O(1)} \cdot \log n\big)$, and

• query time $\tau_Q^1(n,p,q) \le O\big(\binom{p+q}{p} \cdot n^{O(1)}\big) \le n^{O(p+q)}$.

We apply Lemma 4.4 to this construction and get a new construction with the following parameters.

• size $\zeta^2(n,p,q) \le O\big(\binom{p+q}{p} \cdot (p+q)^{O(1)} \cdot \log n\big)$,

. initialization time T]{n,p, q) < 0(2(^+9)'^+"' + (p+i) • {p + qfW ■ nlogn), 

• degree $\Delta^2(n,p,q) \le O\big((\frac{p+q}{q})^{q} \cdot (p+q)^{O(1)} \cdot \log n\big)$, and

• query time $\tau_Q^2(n,p,q) \le O\big((p+q)^{O(p+q)} \cdot \log n\big)$.



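We now apply Lemma 4.5 to this construction. Recall that in Lemma 4.5 we set $s = \lfloor (\log(p+q))^2 \rfloor$ and $t = \lceil \frac{p+q}{s} \rceil$. This yields a new construction of separating collections, with size

\begin{align*}
\zeta^3(n,p,q) &\le |\mathcal{P}_t| \cdot \sum_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \zeta^2(n, p_i, s - p_i) \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \binom{s}{p_i} \cdot s^{O(1)} \cdot \log n \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \binom{s}{p_i} \\
&\le \binom{p+q}{p} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})}
\end{align*}

Now we bound the initialization time.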



< sr'^' + 2^(^) . nlogn + [^ + '^] • 2^(n^5^) • logn^^^^^^.^^ • n^(i) 

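For the degree, we get the following bound.

\begin{align*}
\Delta^3(n,p,q) &\le |\mathcal{P}_t| \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \Delta^2(n, p_i, s - p_i) \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \Big(\frac{s}{s - p_i}\Big)^{s - p_i} \cdot s^{O(1)} \cdot \log n \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot \max_{(q_1,\ldots,q_t) \in \mathcal{Z}^{q}_{s,t}} \prod_{i \le t} \Big(\frac{s}{q_i}\Big)^{q_i} \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot (st)^{q} \cdot \max_{(q_1,\ldots,q_t) \in \mathcal{Z}^{q}_{s,t}} \frac{1}{\prod_{i \le t} (q_i \cdot t)^{q_i}} \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot (p+q)^{q} \cdot \frac{1}{q^{q}} \\
&\le \Big(\frac{p+q}{q}\Big)^{q} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})}
\end{align*}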

In the second to last transition, the inequality

\[
\max_{(q_1,\ldots,q_t) \in \mathcal{Z}^{q}_{s,t}} \frac{1}{\prod_{i \le t} (q_i \cdot t)^{q_i}} \le \frac{1}{q^{q}}
\]

follows from Gibbs' inequality [23, Lemma 22.2]. We now consider the bound for the query time.

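\begin{align*}
\tau_Q^3(n,p,q) &\le O\Big(\Delta^3(n,p,q) \cdot n^{O(1)} + |\mathcal{P}_t| \cdot t \cdot \sum_{\hat{p} \le s} \tau_Q^2(n, \hat{p}, s - \hat{p})\Big) \\
&\le \Big(\frac{p+q}{q}\Big)^{q} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot n^{O(1)}
\end{align*}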



We now apply Lemma 4.4 to construction number 3, and obtain construction number 4, with the following bounds.

• size $\zeta^4(n,p,q) \le \binom{p+q}{p} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot \log n$,

. initialization time Tf{n,p,q) < 2(l°g(P+9))"°"'^+'"'+' + (P+9) • 2^^^^^S^^ ■ nlogn, 

• degree $\Delta^4(n,p,q) \le (\frac{p+q}{q})^{q} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot \log n$, and

• query time $\tau_Q^4(n,p,q) \le (\frac{p+q}{q})^{q} \cdot 2^{O(\frac{p+q}{\log(p+q)})} \cdot \log n$.

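To get construction number 5 from construction number 4 we apply Lemma 4.5 again. Just as in the analysis of construction number 3, we set $s = \lfloor (\log(p+q))^2 \rfloor$ and $t = \lceil \frac{p+q}{s} \rceil$. The size of the obtained construction can be bounded as follows.

\begin{align*}
\zeta^5(n,p,q) &\le |\mathcal{P}_t| \cdot \sum_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \zeta^4(n, p_i, s - p_i) \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \binom{s}{p_i} \cdot 2^{O(\frac{s}{\log s})} \cdot \log n \\
&\le 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \binom{s}{p_i} \\
&\le \binom{p+q}{p} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})}
\end{align*}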

Next we consider the initialization time. 

Tf{s,p,q) < 0((5]r/(n,p,s-j3))+C'(n,p,g)-n^W) 

p<s 

< ,2('°s^)"°^^''+^ + 2^W • nlogn + [^ + '^y 2^(i^iiSlT^) • logn^^^^^^^^^ • n^^^) 

< ['^ + '^y2^(n^iig(lT^).logn^(^Sif^).^0(i) 

In the last transition we insert log {p + q) for s and observe that 

^2(logs)0°s^)'+3 ^ 2(21oglog{p+9))(2l°slog(P+9))'+4 ^ 2^^ l°glog?P+g) ). 

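We proceed to bound the degree.

\begin{align*}
\Delta^5(n,p,q) &\le |\mathcal{P}_t| \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \Delta^4(n, p_i, s - p_i) \\
&\le 2^{O(\frac{p+q}{\log(p+q)})} \cdot \max_{(p_1,\ldots,p_t) \in \mathcal{Z}^{p}_{s,t}} \prod_{i \le t} \Big(\frac{s}{s - p_i}\Big)^{s - p_i} \cdot 2^{O(\frac{s}{\log s})} \cdot \log n \\
&\le 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot \max_{(q_1,\ldots,q_t) \in \mathcal{Z}^{q}_{s,t}} \prod_{i \le t} \Big(\frac{s}{q_i}\Big)^{q_i} \\
&\le \Big(\frac{p+q}{q}\Big)^{q} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})}
\end{align*}

Here the last transition follows from an analysis identical to the last three transitions in the bound for $\Delta^3$. We now consider the query time.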

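\begin{align*}
\tau_Q^5(n,p,q) &\le O\Big(\Delta^5(n,p,q) \cdot n^{O(1)} + |\mathcal{P}_t| \cdot t \cdot \sum_{\hat{p} \le s} \tau_Q^4(n, \hat{p}, s - \hat{p})\Big) \\
&\le \Big(\frac{p+q}{q}\Big)^{q} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot (\log n)^{O(\frac{p+q}{\log^2(p+q)})} \cdot n^{O(1)}
\end{align*}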



Finally, we apply Lemma 4.4 to construction number 5, and obtain construction number 6, with the following bounds.

• size $\zeta^6(n,p,q) \le \binom{p+q}{p} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot \log n$,

• initialization time $\tau_I^6(n,p,q) \le \binom{p+q}{p} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot n \log n$,

• degree $\Delta^6(n,p,q) \le (\frac{p+q}{q})^{q} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot \log n$, and

• query time $\tau_Q^6(n,p,q) \le (\frac{p+q}{q})^{q} \cdot 2^{O(\frac{p+q}{\log\log(p+q)})} \cdot \log n$.

Construction 6 has the parameters claimed in the statement of the lemma. This concludes the proof. □

5 Applications 

In this section we demonstrate how the efficient construction of representative families can be used to design single-exponential parameterized and exact exponential time algorithms. Our applications include best known deterministic algorithms for Longest Directed Cycle, Minimum Equivalent Graph, $k$-Path and $k$-Tree. We also provide alternate deterministic algorithms running in time $2^{O(t)} n$ for "connectivity" problems such as Hamiltonian Cycle or Steiner Tree on $n$-vertex graphs of treewidth at most $t$.

Let $M = (E, \mathcal{I})$ be a matroid with a ground set of size $n$ and let $\mathcal{S} = \{S_1, \ldots, S_t\}$ be a $p$-family of independent sets. For specific matroids we use the following notation to denote the time required to compute $q$-representative families of $\mathcal{S}$:

• $\tau_{rm}(t,p,q)$ is the time required to compute a family $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$ of size $\binom{p+q}{p}$, when $M$ is a linear matroid.

• $\tau_{um}(t,p,q)$ is the time required to compute a family $\widehat{\mathcal{S}} \subseteq^{q}_{rep} \mathcal{S}$ of size $\binom{p+q}{p} \cdot 2^{o(p+q)} \cdot \log n$, when $M$ is a uniform matroid.



Recall that by Theorem 1, when the rank of $M$ is $p+q$, $\tau_{rm}(t,p,q)$ is bounded by $O\big(\binom{p+q}{p} t p^{\omega} + t \binom{p+q}{q}^{\omega-1}\big)$ multiplied by the time required to perform operations over $\mathbb{F}$. By Theorem 2, $\tau_{um}(t,p,q) = O\big(t \cdot (\frac{p+q}{q})^{q} \cdot \log n\big)$.
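For small instances the notion of a $q$-representative family can be checked directly from its definition. The following Python sketch does this by brute force for a uniform matroid of rank $p+q$, where a $p$-set $X$ and a disjoint set $Y$ of size at most $q$ always have an independent union, so only disjointness has to be tested. The function name and the input encoding are assumptions made for illustration; this is not the algorithm of Theorem 2 and it runs in time exponential in $q$.

```python
from itertools import combinations


def is_q_representative(sub_family, family, universe, q):
    """Brute-force test of q-representativity in a uniform matroid of rank p + q.

    sub_family must be contained in family, and for every set Y of at most q
    elements: if some X in family is disjoint from Y, then some X' in
    sub_family must be disjoint from Y as well.
    """
    sub = [frozenset(x) for x in sub_family]
    fam = [frozenset(x) for x in family]
    if not set(sub) <= set(fam):
        return False
    elements = sorted(universe)
    for size in range(q + 1):
        for y in combinations(elements, size):
            y = frozenset(y)
            fits_family = any(x.isdisjoint(y) for x in fam)
            fits_sub = any(x.isdisjoint(y) for x in sub)
            if fits_family and not fits_sub:
                return False
    return True
```

For example, with the family of all singletons of $\{1,2,3,4\}$ (so $p = 1$), two singletons suffice for $q = 1$ and three for $q = 2$, matching the bound $\binom{p+q}{p}$.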



5.1 Long Directed Cycle 

In this section we give our first application of algorithms based on representative families. We 
study the following problem. 



Longest Directed Cycle    Parameter: $k$

Input: An $n$-vertex and $m$-arc directed graph $D$ and a positive integer $k$.
Question: Does there exist a directed cycle of length at least $k$ in $D$?



Observe that the Longest Directed Cycle problem is different from the well-known problem of finding a directed cycle of length exactly $k$. It is quite possible that the only directed cycle of length at least $k$ is much longer than $k$, and possibly even a Hamiltonian cycle. Let $D$ be a directed graph, $k$ be a positive integer, and $M = (E, \mathcal{I})$ be the uniform matroid $U_{n,2k}$ where $E = V(D)$ and $\mathcal{I} = \{S \subseteq V(D) : |S| \le 2k\}$. In this subsection whenever we talk about independent sets, these are independent sets of the uniform matroid $U_{n,2k}$. For a pair of vertices $u,v \in V(D)$, we define

\[
\mathcal{P}^{i}_{uv} = \Big\{ X : X \subseteq V(D),\ u,v \in X,\ |X| = i, \text{ and there is a directed } uv\text{-path in } D \text{ of length } i-1 \text{ with all the vertices belonging to } X \Big\}.
\]

We start with a structural lemma providing the key insight to our algorithm.

Figure 1: Illustration to the proof of Lemma 5.1.

Lemma 5.1. Let $D$ be a directed graph. Then $D$ has a directed cycle of length at least $k$ if and only if there exists a pair of vertices $u,v \in V(D)$ and $X \in \widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$ such that $D$ has a directed cycle $C$ and in this cycle vertices of $X$ induce a directed path (that is, vertices of $X$ form a consecutive segment in $C$).

Proof. The reverse direction of the proof is straightforward: if the cycle $C$ contains a path on $k$ vertices, the length of $C$ is at least $k$. We proceed with the proof of the forward direction. Let $C^* = v_1 v_2 \cdots v_r v_1$ be a smallest directed cycle in $D$ of length at least $k$. That is, $r \ge k$ and there is no directed cycle of length $r'$ where $k \le r' < r$. We consider two cases.

Case A: $r \le 2k$. In this case we take $u = v_1$ and $v = v_k$. We define paths $P = v_1 v_2 \cdots v_k$ and $Q = v_{k+1} \cdots v_r$. Because $|Q| \le k$, by the definition of $\widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$ there exists a directed $uv$-path $P'$ such that $X = V(P') \in \widehat{\mathcal{P}}^{k}_{uv}$ and $X \cap Q = \emptyset$. By replacing $P$ with $P'$ in $C^*$ we obtain a directed cycle $C$ of length at least $k$ containing $P'$ as a subpath.

Case B: $r \ge 2k + 1$. In this case we set $u = v_1$, $v = v_k$, and split $C^*$ into three paths $P = v_1 \cdots v_k$, $Q = v_{k+1} \cdots v_{2k}$, and $R = v_{2k+1} \cdots v_r$. Since $|Q| = k$ and $\widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$, it follows that there exists a $uv$-path $P'$ such that $X = V(P') \in \widehat{\mathcal{P}}^{k}_{uv}$ and $X \cap Q = \emptyset$. However, $P'$ is not necessarily disjoint from $R$, and by replacing $P$ with $P'$ in $C^*$ we are only guaranteed to obtain a closed walk $C'$ containing $P'$ as a subpath. See Fig. 1 for an illustration.

If $X \cap R = \emptyset$, then $C'$ is a simple cycle and we take $C'$ as the desired $C$. We claim that this is the only possibility. Assume, for the sake of contradiction, that $X \cap R \ne \emptyset$. We want to show that in this case there is a cycle of length at least $k$ but shorter than $C^*$, contradicting the choice of $C^*$. Let $v_\alpha$ be the last vertex in $X \cap R$ when we walk from $v_1$ to $v_k$ along $P'$. Let $P'[v_\alpha, v_k]$ be the subpath of $P'$ starting at $v_\alpha$ and ending at $v_k$. If $v_\alpha = v_{2k+1}$, we set $R' = \emptyset$. Otherwise we put $R' = R[v_{2k+1}, v_{\alpha-1}]$ to be the subpath of $R$ starting at $v_{2k+1}$ and ending at $v_{\alpha-1}$. Observe that since the arc $v_{\alpha-1} v_\alpha$ is present in $D$ (in fact it is an arc of the cycle $C^*$), we have that $\tilde{C} = P'[v_\alpha, v_k]\, Q\, R'$ is a simple cycle in $D$. Clearly, $|\tilde{C}| \ge |Q| \ge k$. Furthermore, since $v_1$ is not present in $P'[v_\alpha, v_k]$ we have that $|P'[v_\alpha, v_k]| < |P'| = |P|$. Similarly, since $v_\alpha$ is not present in $R'$, we have that $|R'| < |R|$. Thus we have

\[
k \le |\tilde{C}| = |P'[v_\alpha, v_k]| + |Q| + |R'| < |P| + |Q| + |R| = |C^*|.
\]

This implies that $\tilde{C}$ is a directed simple cycle of length at least $k$ and strictly smaller than $r$. This is a contradiction. Hence by replacing $P$ with $P'$ in $C^*$ we obtain a directed cycle $C$ containing $P'$ as a subpath. This concludes the proof. □

The next lemma provides an efficient computation of the family $\widehat{\mathcal{P}}^{p}_{uv} \subseteq^{\ell-p}_{rep} \mathcal{P}^{p}_{uv}$.



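Lemma 5.2. Let $D$ be a directed/undirected graph with $n$ vertices and $m$ edges, $u \in V(D)$, and $M = (E, \mathcal{I})$ be the uniform matroid $U_{n,\ell}$ where $E = V(D)$ and $\mathcal{I} = \{S \subseteq V(D) : |S| \le \ell\}$. Then for every $p \le \ell$ and $v \in V(D) \setminus \{u\}$, a family $\widehat{\mathcal{P}}^{p}_{uv} \subseteq^{\ell-p}_{rep} \mathcal{P}^{p}_{uv}$ of size at most

\[
\binom{\ell}{p} \cdot 2^{o(\ell)} \cdot \log n
\]

can be found in time

\[
O\Big(2^{o(\ell)} m \log^2 n \max_{i \in [p]} \Big\{\binom{\ell}{i} \Big(\frac{\ell}{\ell - i}\Big)^{\ell - i}\Big\}\Big).
\]

Furthermore, within the same running time every set in $\widehat{\mathcal{P}}^{p}_{uv}$ can be ordered in a way that it corresponds to a directed (undirected) path in $D$.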

Proof. We prove the lemma only for digraphs. The proof for undirected graphs is analogous and we only point out the differences with the proof for the directed case. We describe a dynamic programming based algorithm. Let $V(D) = \{u, v_1, \ldots, v_{n-1}\}$ and $\mathcal{D}$ be a $(p-1) \times (n-1)$ matrix where the rows are indexed by the integers in $\{2, \ldots, p\}$ and the columns are indexed by the vertices in $\{v_1, \ldots, v_{n-1}\}$. The entry $\mathcal{D}[i, v]$ will store the family $\widehat{\mathcal{P}}^{i}_{uv} \subseteq^{\ell-i}_{rep} \mathcal{P}^{i}_{uv}$. We fill the entries in the matrix $\mathcal{D}$ in increasing order of rows. For $i = 2$, $\mathcal{D}[2, v] = \{\{u, v\}\}$ if $uv \in A(D)$ (for an undirected graph we check whether $u$ and $v$ are adjacent). Assume that we have filled all the entries until the row $i$. Let

\[
\mathcal{N}^{i+1}_{uv} = \bigcup_{w \in N^{-}(v)} \widehat{\mathcal{P}}^{i}_{uw} \bullet \{v\}.
\]

For undirected graphs we use the following definition

\[
\mathcal{N}^{i+1}_{uv} = \bigcup_{w \in N(v)} \widehat{\mathcal{P}}^{i}_{uw} \bullet \{v\}.
\]

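Claim 5.1. $\mathcal{N}^{i+1}_{uv} \subseteq^{\ell-(i+1)}_{rep} \mathcal{P}^{i+1}_{uv}$.

Proof. Let $S \in \mathcal{P}^{i+1}_{uv}$ and $Y$ be a set of size $\ell - (i+1)$ (which is essentially an independent set of $U_{n,\ell}$) such that $S \cap Y = \emptyset$. We will show that there exists a set $S' \in \mathcal{N}^{i+1}_{uv}$ such that $S' \cap Y = \emptyset$. This will imply the desired result. Since $S \in \mathcal{P}^{i+1}_{uv}$ there exists a directed path $P = u a_1 \cdots a_{i-1} v$ in $D$ such that $S = \{u, a_1, \ldots, a_{i-1}, v\}$ and $a_{i-1} \in N^{-}(v)$. The existence of the path $P[u, a_{i-1}]$, the subpath of $P$ between $u$ and $a_{i-1}$, implies that $X^* = S \setminus \{v\} \in \mathcal{P}^{i}_{u a_{i-1}}$. Take $Y^* = Y \cup \{v\}$. Observe that $X^* \cap Y^* = \emptyset$ and $|Y^*| = \ell - i$. Since $\widehat{\mathcal{P}}^{i}_{u a_{i-1}} \subseteq^{\ell-i}_{rep} \mathcal{P}^{i}_{u a_{i-1}}$ there exists a set $\widehat{X}^* \in \widehat{\mathcal{P}}^{i}_{u a_{i-1}}$ such that $\widehat{X}^* \cap Y^* = \emptyset$. However, since $a_{i-1} \in N^{-}(v)$ and $\widehat{X}^* \cap \{v\} = \emptyset$ (as $\widehat{X}^* \cap Y^* = \emptyset$), we have $\widehat{X}^* \bullet \{v\} = \widehat{X}^* \cup \{v\}$ and $\widehat{X}^* \cup \{v\} \in \mathcal{N}^{i+1}_{uv}$. Taking $S' = \widehat{X}^* \cup \{v\}$ suffices for our purpose. This completes the proof of the claim. □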

We fill the entry for $\mathcal{D}[i+1, v]$ as follows. Observe that

\[
\mathcal{N}^{i+1}_{uv} = \bigcup_{w \in N^{-}(v)} \mathcal{D}[i, w] \bullet \{v\}.
\]

We already have computed the family corresponding to $\mathcal{D}[i, w]$ for $w \in N^{-}(v)$. By Theorem 2, $|\widehat{\mathcal{P}}^{i}_{uw}| \le \binom{\ell}{i} 2^{o(\ell)} \log n$ and thus $|\mathcal{N}^{i+1}_{uv}| \le d^{-}(v) \binom{\ell}{i} 2^{o(\ell)} \log n$. Furthermore, we can compute $\mathcal{N}^{i+1}_{uv}$ in time $O\big(d^{-}(v) \binom{\ell}{i} 2^{o(\ell)} \log n\big)$. Now using Theorem 2 we compute $\widehat{\mathcal{N}}^{i+1}_{uv} \subseteq^{\ell-i-1}_{rep} \mathcal{N}^{i+1}_{uv}$ in time $\tau_{um}(t, i+1, \ell-i-1)$, where $t = d^{-}(v) \binom{\ell}{i} 2^{o(\ell)} \log n$. By Claim 5.1 we know that $\mathcal{N}^{i+1}_{uv} \subseteq^{\ell-i-1}_{rep} \mathcal{P}^{i+1}_{uv}$. Thus Lemma 3.1 implies that $\widehat{\mathcal{N}}^{i+1}_{uv} = \widehat{\mathcal{P}}^{i+1}_{uv} \subseteq^{\ell-i-1}_{rep} \mathcal{P}^{i+1}_{uv}$. We assign this family to $\mathcal{D}[i+1, v]$. This completes the description and the correctness of the algorithm. We order the vertices of the sets in $\widehat{\mathcal{P}}^{p}_{uv}$ in the following way, so that each ordered set corresponds to a directed (undirected) path in $D$. We keep the sets in the order in which they are built using the $\bullet$ operation. That is, we can view these sets as strings and the $\bullet$ operation as concatenation. Then every ordered set in our family represents a path in the graph. The running time of the algorithm is bounded by



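\[
O\Big(\sum_{i=2}^{p} \sum_{j=1}^{n-1} \tau_{um}\big(d^{-}(v_j) \tbinom{\ell}{i-1} 2^{o(\ell)} \log n,\ i,\ \ell - i\big)\Big) = O\Big(2^{o(\ell)} m \log^2 n \max_{i \in [p]} \Big\{\binom{\ell}{i} \Big(\frac{\ell}{\ell - i}\Big)^{\ell - i}\Big\}\Big).
\]

This completes the proof. □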
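The table-filling order of the proof can be illustrated by a short Python sketch. It builds, row by row, the vertex sets that carry a directed path starting at $u$, and stores each set together with its vertices in path order (the "string" view mentioned above). The sketch deliberately omits the representative-family pruning of Theorem 2, so it computes the full families $\mathcal{P}^{i}_{uv}$ and may take exponential time; the function name and the adjacency-dictionary encoding are assumptions made for illustration.

```python
def path_vertex_sets(out_neighbors, u, p):
    """Unpruned version of the dynamic programme for path families.

    Returns a dict mapping (i, v) to a dict {vertex set: path order}, where
    the vertex set has i elements and the order is a directed u-v path on
    exactly those vertices. Entries exist only for non-empty families.
    """
    families = {}
    # Row i = 2: a single arc leaving u.
    for v in out_neighbors.get(u, ()):
        if v != u:
            families.setdefault((2, v), {})[frozenset((u, v))] = (u, v)
    # Row i + 1 is built from row i by appending one out-neighbour.
    for i in range(2, p):
        for (size, w), fam in list(families.items()):
            if size != i:
                continue
            for x, order in fam.items():
                for v in out_neighbors.get(w, ()):
                    if v not in x:  # the bullet operation joins disjoint sets only
                        row = families.setdefault((i + 1, v), {})
                        row.setdefault(x | {v}, order + (v,))
    return families
```

In the algorithm of the lemma, each row would additionally be replaced by a representative subfamily before the next row is built, which is what keeps the table size bounded.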

Finally, we are ready to state the main result of this section.

Theorem 6. Longest Directed Cycle can be solved in time $O(8^{k+o(k)} m n^2 \log n)$.

Proof. Let $D$ be a directed graph. We solve the problem by applying the structural characterization proved in Lemma 5.1. By Lemma 5.1, $D$ has a directed cycle of length at least $k$ if and only if there exists a pair of vertices $u,v \in V(D)$ and a path $P'$ with $V(P') \in \widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$ such that $D$ has a directed cycle $C$ containing $P'$ as a subpath.



We first compute $\widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$ for all $u,v \in V(D)$. For that we apply Lemma 5.2 for each vertex $u \in V(D)$ with $\ell = 2k$ and $p = k$. Thus, we can compute $\widehat{\mathcal{P}}^{k}_{uv} \subseteq^{k}_{rep} \mathcal{P}^{k}_{uv}$ for all $u,v \in V(D)$ in time $O(8^{k+o(k)} m n \log^2 n)$. Moreover, for every $X \in \widehat{\mathcal{P}}^{k}_{uv}$ we also compute a directed $uv$-path $P_X$ using the vertices of $X$. Let

\[
\mathcal{Q} = \bigcup_{u,v \in V(D)} \widehat{\mathcal{P}}^{k}_{uv}.
\]

Now for every set $X \in \mathcal{Q}$ and the corresponding $uv$-path $P_X$ with endpoints $u$ and $v$, we check if there is a $vu$-path in $D$ avoiding all vertices of $X$ but $u$ and $v$. This check can be done by a standard graph traversal algorithm like BFS/DFS in time $O(m + n)$. If we succeed in finding a path for at least one $X \in \mathcal{Q}$, we answer YES and return the corresponding directed cycle obtained by merging $P_X$ and the other path. Otherwise, if we do not succeed in finding such a path for any of the sets $X \in \mathcal{Q}$, there is no directed cycle of length at least $k$ in $D$. The correctness of the algorithm follows from Lemma 5.1. By Theorem 2, the size of $\mathcal{Q}$ is upper bounded by $n^2 \binom{2k}{k} 2^{o(k)} \log n \le n^2 4^{k+o(k)} \log n$. Thus the overall running time of the algorithm is upper bounded by

\[
O\big(8^{k+o(k)} m n \log^2 n + 4^{k+o(k)} (n^2 m + n^3) \log n\big).
\]

This concludes the proof. □
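The closing check used in the proof, a traversal that avoids the internal vertices of $P_X$, can be sketched in Python as follows. The function name and the adjacency-dictionary encoding are assumptions made for illustration, and the path is assumed to have at least two vertices.

```python
from collections import deque


def closes_to_cycle(out_neighbors, path):
    """Try to close a directed u-v path into a directed cycle.

    Runs a breadth-first search from v towards u that never enters an
    internal vertex of the path. Returns the cycle as a list of vertices
    (the path followed by the interior of the return route), or None.
    """
    u, v = path[0], path[-1]
    blocked = set(path) - {u, v}
    parent = {v: None}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        for y in out_neighbors.get(x, ()):
            if y in blocked or y in parent:
                continue
            parent[y] = x
            if y == u:
                # Walk back from u to v to recover the interior of the return route.
                back = []
                z = parent[u]
                while z != v:
                    back.append(z)
                    z = parent[z]
                back.reverse()
                return list(path) + back
            queue.append(y)
    return None
```

Each call takes $O(m + n)$ time, which is the per-set cost accounted for in the running time above.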
5.2 Minimum Equivalent Graph

For a given digraph $D$, a subdigraph $D'$ of $D$ is said to be an equivalent subdigraph of $D$ if for any pair of vertices $u,v \in V(D)$, if there is a directed path in $D$ from $u$ to $v$ then there is also a directed path from $u$ to $v$ in $D'$. That is, reachability of vertices in $D$ and $D'$ is the same. In this section we study a problem where, given a digraph $D$, the objective is to find an equivalent subdigraph $D'$ of $D$ with as few arcs as possible. Equivalently, the objective is to remove the maximum number of arcs from a digraph $D$ without affecting its reachability. More precisely, the problem we study is as follows.



Minimum Equivalent Graph (MEG) 

Input: A directed graph D 

Task: Find an equivalent subdigraph of D with the minimum number of arcs. 



The following proposition, due to Moyles and Thompson [36] (see also [3, Section 2.3]), reduces the problem of finding a minimum equivalent subdigraph of an arbitrary digraph $D$ to the case of a strong digraph.

Proposition 5.1. Let $D$ be a digraph on $n$ vertices with strongly connected components $C_1, \ldots, C_r$. Given a minimum equivalent subdigraph $C'_i$ for each $C_i$, $i \in [r]$, one can obtain a minimum equivalent subdigraph $D'$ of $D$ containing each of $C'_i$ in $O(n^{\omega})$ time.

Observe that for a strong digraph $D$ any equivalent subdigraph is also strong. By Proposition 5.1, MEG reduces to the following problem.



Minimum Strongly Connected Spanning Subgraph (Minimum SCSS) 

Input: A strongly connected directed graph D 

Task: Find a strong spanning subdigraph of D with the minimum number of arcs. 



There seems to be no established agreement in the literature on what to call these problems. MEG is sometimes also referred to as Minimum Equivalent Digraph and Minimum Equivalent Subdigraph, while Minimum SCSS is also called Minimum Spanning Strong Subdigraph (MSSS).

A digraph $T$ is an out-tree (an in-tree) if $T$ is an oriented tree with just one vertex $s$ of in-degree zero (out-degree zero). The vertex $s$ is the root of $T$. If an out-tree (in-tree) $T$ is a spanning subdigraph of $D$, $T$ is called an out-branching (an in-branching). We use the notation $B_s^{+}$ ($B_s^{-}$) to denote an out-branching (in-branching) rooted at $s$ of the digraph.

It is known that a digraph is strong if and only if it contains an out-branching and an in-branching rooted at some vertex $v \in V(D)$ [3, Proposition 12.1.1].

Proposition 5.2. Let $D$ be a strong digraph on $n$ vertices, let $v$ be an arbitrary vertex of $V(D)$, and $\ell \le n - 2$ be a natural number. Then there exists a strong spanning subdigraph of $D$ with at most $2n - 2 - \ell$ arcs if and only if $D$ contains an in-branching $B_v^{-}$ and an out-branching $B_v^{+}$ with root $v$ so that $|A(B_v^{+}) \cap A(B_v^{-})| \ge \ell$ (that is, they have at least $\ell$ common arcs).

Proposition 5.2 implies that the Minimum SCSS problem is equivalent to finding, for an arbitrary vertex $v \in V(D)$, an out-branching $B_v^{+}$ and an in-branching $B_v^{-}$ that maximize $|A(B_v^{+}) \cap A(B_v^{-})|$. For our exact algorithm for Minimum SCSS we implement this equivalent version using representative sets.

Let $D$ be a strong digraph and $s \in V(D)$ be a fixed vertex. For $v \in V(D)$ we use $\mathrm{In}(v)$ and $\mathrm{Out}(v)$ to denote the sets of incoming and outgoing arcs incident with $v$. By $D_s^{-}$ we denote the digraph obtained from $D$ by deleting the arcs in $\mathrm{Out}(s)$. Similarly, by $D_s^{+}$ we denote the digraph obtained from $D$ by deleting the arcs in $\mathrm{In}(s)$.

We take two copies $E_1, E_2$ of $A(D)$ (that is, $E_j = \{e_j : e \in A(D)\}$), a copy $E_3$ of $A(D_s^{+})$ and a copy $E_4$ of $A(D_s^{-})$, and construct four matroids as follows. Let $U(D)$ denote the underlying undirected graph of $D$. The first two matroids $M_1 = (E_1, \mathcal{I}_1)$, $M_2 = (E_2, \mathcal{I}_2)$ are the graphic matroids on $U(D)$. Observe that

\[
A(D_s^{+}) = \biguplus_{v \in V(D_s^{+})} \mathrm{In}(v) \quad \text{and} \quad A(D_s^{-}) = \biguplus_{v \in V(D_s^{-})} \mathrm{Out}(v).
\]

Thus the arcs of $D_s^{+}$ can be partitioned into sets of in-arcs and similarly the arcs of $D_s^{-}$ into sets of out-arcs. The other two matroids are the following partition matroids $M_3 = (E_3, \mathcal{I}_3)$, $M_4 = (E_4, \mathcal{I}_4)$, where

\[
\mathcal{I}_3 = \{I : I \subseteq A(D_s^{+}), \text{ for every } v \in V(D_s^{+}) = V(D),\ |I \cap \mathrm{In}(v)| \le 1\},
\]

and

\[
\mathcal{I}_4 = \{I : I \subseteq A(D_s^{-}), \text{ for every } v \in V(D_s^{-}) = V(D),\ |I \cap \mathrm{Out}(v)| \le 1\}.
\]

We define the matroid $M = (E, \mathcal{I})$ as the direct sum $M = M_1 \oplus M_2 \oplus M_3 \oplus M_4$. Since each $M_i$ is a representable matroid over the same field (by Propositions 2.2 and 2.3), we have that $M$ is also representable (Proposition 2.1). The reason we say that each $M_i$ is representable over the same field $\mathbb{F}$ is that the graphic matroid is representable over any field and the partition matroids defined here are representable over a finite field of size $n^{O(1)}$. So if we take $\mathbb{F}$ as a finite field of size $n^{O(1)}$ then $M$ is representable over $\mathbb{F}$. The rank of this matroid is $4n - 4$.
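Independence in the direct sum can also be tested combinatorially, without a linear representation: a set is independent exactly when its restriction to each of the four copies is independent in the corresponding matroid. The Python sketch below illustrates this under an assumed encoding in which an element $e_j$ is a pair `(arc, j)` with `arc = (tail, head)`; the function names are hypothetical, and parallel or antiparallel arcs are treated as distinct edges of $U(D)$.

```python
def is_forest(arcs):
    """Graphic-matroid test: the arcs, read as undirected edges, contain no cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in arcs:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
    return True


def independent_in_direct_sum(elements, s):
    """Independence test in M1 + M2 + M3 + M4 for a set of (arc, copy) pairs."""
    copies = {j: [arc for arc, c in elements if c == j] for j in (1, 2, 3, 4)}
    # Copies 1 and 2: graphic matroids on the underlying undirected graph.
    if not (is_forest(copies[1]) and is_forest(copies[2])):
        return False
    heads = [head for (_, head) in copies[3]]
    tails = [tail for (tail, _) in copies[4]]
    # Copy 3 lives on D_s^+ (no arcs into s), copy 4 on D_s^- (no arcs out of s).
    if s in heads or s in tails:
        return False
    # Partition matroids: at most one chosen in-arc (out-arc) per vertex.
    return len(heads) == len(set(heads)) and len(tails) == len(set(tails))
```

The algorithms of this section rely on a linear representation of $M$ instead, since Theorem 1 works with representation matrices; the sketch is only meant to make the four independence conditions concrete.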

Let us note that for each arc $e \in A(D)$ which is not incident with $s$, we have four elements in the matroid $M$, corresponding to the copies of $e$ in $M_i$, $i \in \{1, \ldots, 4\}$. We denote these elements by $e_i$, $i \in \{1, \ldots, 4\}$. For every arc $e \in A(D)$ incident with $s$, we have three corresponding elements. We denote them by $e_1, e_2, e_3$ or $e_1, e_2, e_4$, depending on whether $e$ is an in-arc or an out-arc of $s$.

For $i \in \{1, \ldots, n-1\}$, we define

\[
\mathcal{B}^{4i} = \{W : W \in \mathcal{I},\ |W| = 4i,\ \forall e \in A(D) \text{ either } W \cap \{e_1, e_2, e_3, e_4\} = \emptyset \text{ or } \{e_1, e_2, e_3, e_4\} \subseteq W\}.
\]

For $W \in \mathcal{I}$, by $A_W$ we denote the set of arcs $e \in A(D)$ such that $\{e_1, e_2, e_3, e_4\} \cap W \ne \emptyset$. Now we are ready to state the lemma that relates representative sets and the Minimum SCSS problem.

Lemma 5.3. Let $D$ be a strong digraph on $n$ vertices and $\ell \le n - 2$ be a natural number. Then there exists a strong spanning subdigraph $D'$ of $D$ with at most $2n - 2 - \ell$ arcs if and only if there exists a set $\widehat{F} \in \widehat{\mathcal{B}}^{4\ell} \subseteq^{n'-4\ell}_{rep} \mathcal{B}^{4\ell}$ such that $D$ has a strong spanning subdigraph $\bar{D}$ with $A_{\widehat{F}} \subseteq A(\bar{D})$. Here, $n' = 4n - 4$.

Proof. We only show the forward direction of the proof; the reverse direction is straightforward. Let $D'$ be a strong spanning subdigraph of $D$ with at most $2n - 2 - \ell$ arcs. Thus, by Proposition 5.2 we have that for any vertex $v \in V(D')$, there exists an out-branching $B_v^{+}$ and an in-branching $B_v^{-}$ in $D'$ such that $|A(B_v^{+}) \cap A(B_v^{-})| \ge \ell$. Observe that the arcs in $A(B_v^{+}) \cap A(B_v^{-})$ form an out-forest (in-forest). Let $F'$ be an arbitrary subset of $A(B_v^{+}) \cap A(B_v^{-})$ containing exactly $\ell$ arcs. Take $X = A(B_v^{+}) \setminus F'$ and $Y = A(B_v^{-}) \setminus F'$. Observe that $X$ and $Y$ need not be disjoint. Clearly, $|X| = |Y| = n - 1 - \ell$.

In the matroid $M$, one can associate with $D'$ an independent set $I_{D'}$ of size $4n - 4$ as follows:

\[
I_{D'} = \bigcup_{e \in F'} \{e_1, e_2, e_3, e_4\} \ \cup\ \bigcup_{e \in X} \{e_1, e_3\} \ \cup\ \bigcup_{e \in Y} \{e_2, e_4\}.
\]

By our construction, $I_{D'}$ is an independent set in $\mathcal{I}$ and $|I_{D'}| = 4\ell + 4(n - 1 - \ell) = n'$. Let $F = \bigcup_{e \in F'} \{e_1, e_2, e_3, e_4\}$, $\bar{X} = \bigcup_{e \in X} \{e_1, e_3\}$ and $\bar{Y} = \bigcup_{e \in Y} \{e_2, e_4\}$. Then notice that $F \in \mathcal{B}^{4\ell}$ and $F \subseteq I_{D'}$. This implies that there exists a set $\widehat{F} \in \widehat{\mathcal{B}}^{4\ell} \subseteq^{n'-4\ell}_{rep} \mathcal{B}^{4\ell}$ such that $I_{\bar{D}} = \widehat{F} \cup \bar{X} \cup \bar{Y} \in \mathcal{I}$. We show that $D$ has a strong spanning subdigraph $\bar{D}$ with $A_{\widehat{F}} \subseteq A(\bar{D})$. Let $\bar{D}$ be the digraph with the vertex set $V(D)$ and the arc set $A(\bar{D}) = X \cup Y \cup A_{\widehat{F}}$. Consider the following four sets.

1. Let $W_1 = \{e_1 : e \in X \cup A_{\widehat{F}}\}$; then we have that $W_1 \subseteq I_{\bar{D}}$ and thus $W_1 \in \mathcal{I}_1$. This together with the fact that $|W_1| = n - 1$ implies that $X \cup A_{\widehat{F}}$ forms a spanning tree in $U(D)$.

2. Let $W_2 = \{e_2 : e \in Y \cup A_{\widehat{F}}\}$. Similarly to the first case, $Y \cup A_{\widehat{F}}$ forms a spanning tree in $U(D)$.

3. Let $W_3 = \{e_3 : e \in X \cup A_{\widehat{F}}\}$; then we have that $W_3 \subseteq I_{\bar{D}}$ and thus $W_3 \in \mathcal{I}_3$. This together with the fact that $|W_1| = |W_3| = n - 1$ and that $X \cup A_{\widehat{F}}$ is a spanning tree in $U(D)$ implies that $X \cup A_{\widehat{F}}$ forms an out-branching rooted at $s$ in $D_s^{+}$.

4. Let $W_4 = \{e_4 : e \in Y \cup A_{\widehat{F}}\}$. Similarly to the previous case, $Y \cup A_{\widehat{F}}$ forms an in-branching rooted at $s$ in $D_s^{-}$.

We have shown that $\bar{D}$ contains $A_{\widehat{F}}$ and has an out-branching and an in-branching rooted at $s$. This implies that $\bar{D}$ is the desired strong spanning subdigraph of $D$ containing a set from $\widehat{\mathcal{B}}^{4\ell}$. This concludes the proof of the lemma. □

Lemma 5.4. Let D he a strong digraph on n vertices and i < n — 2 be a natural number. 
Then in time O (maXj^r^i (^J mn^logn) we can compute B^^ ^"gp^^ B^^ of size Q^. Here, 
n' = 4n- 4. 

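Proof. We describe a dynamic programming based algorithm. Let $\mathcal{D}$ be an array of size $\ell + 1$. The entry $\mathcal{D}[i]$ will store the family $\widehat{\mathcal{B}}^{4i} \subseteq^{n'-4i}_{rep} \mathcal{B}^{4i}$. We fill the entries in the array $\mathcal{D}$ in increasing order of index, that is, from $0, \ldots, \ell$. For the base case define $\widehat{\mathcal{B}}^{0} = \{\emptyset\}$ and let $\mathcal{W} = \{\{e_1, e_2, e_3, e_4\} : e \in A(D)\}$. Given that $\mathcal{D}[i']$ is filled for all $i' \le i$, we fill $\mathcal{D}[i+1]$ as follows. Define $\mathcal{N}^{4(i+1)} = (\widehat{\mathcal{B}}^{4i} \bullet \mathcal{W}) \cap \mathcal{I}$.

Claim 5.2. For all $0 \le i \le \ell - 1$, $\mathcal{N}^{4(i+1)} \subseteq^{n'-4(i+1)}_{rep} \mathcal{B}^{4(i+1)}$.

Proof. Let $S \in \mathcal{B}^{4(i+1)}$ and $Y$ be a set of size $n' - 4(i+1)$ such that $S \cap Y = \emptyset$ and $S \cup Y \in \mathcal{I}$. We will show that there exists a set $\widehat{S} \in \mathcal{N}^{4(i+1)}$ such that $\widehat{S} \cap Y = \emptyset$ and $\widehat{S} \cup Y \in \mathcal{I}$. This will imply the desired result.

Let $e \in A(D)$ be such that $\{e_1, e_2, e_3, e_4\} \subseteq S$. Define $S^* = S \setminus \{e_1, e_2, e_3, e_4\}$ and $Y^* = Y \cup \{e_1, e_2, e_3, e_4\}$. Since $S \cup Y \in \mathcal{I}$ we have that $S^* \in \mathcal{I}$ and $Y^* \in \mathcal{I}$. Observe that $S^* \in \mathcal{B}^{4i}$, $S^* \cup Y^* \in \mathcal{I}$ and the size of $Y^*$ is $n' - 4i$. This implies that there exists $\widehat{S}^*$ in $\widehat{\mathcal{B}}^{4i} \subseteq^{n'-4i}_{rep} \mathcal{B}^{4i}$ such that $\widehat{S}^* \cup Y^* \in \mathcal{I}$. Thus $\widehat{S}^* \cup \{e_1, e_2, e_3, e_4\} \in \mathcal{I}$ and it is also in $\widehat{\mathcal{B}}^{4i} \bullet \mathcal{W}$ and thus in $\mathcal{N}^{4(i+1)}$. Taking $\widehat{S} = \widehat{S}^* \cup \{e_1, e_2, e_3, e_4\}$ suffices for our purpose. This completes the proof of the claim. □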
We fill the entry for $\mathcal{D}[i+1]$ as follows. Observe that $\mathcal{N}^{4(i+1)} = (\mathcal{D}[i] \bullet \mathcal{W}) \cap \mathcal{I}$. We
already have computed the family corresponding to 2?[i]. By Theorem 1 \B'^^\ < (^J and thus 
|^4(i+i)| ^ 4m(2.). Furthermore, we can compute J\f'^y^~^^) in time O (mn(2.)]. Now using 

Theorem (ll we can compute A/'^(*+i) c"ep ^^*"^^^ AA^(»+i) in time Trmit, 4i + 4, n' - 4{i + 1)), 
where t = 4r?T.("j. 



By Claim y we know that Af^^'+^^ C^rep^^''^^'' i3^(*+i). Thus Lemma [s^ implies that 
j^4(i+i) ^ gA{i+i) Cr'ep^'^'^^^ ^4(*+i)_ ^^ ^^ggjg^^ ^j^jg f^^j^j^y ^Q p[- ^ ^]_ rpj^jg completes the 

description and the correctness of the dynamic programming. The field size for uniform matroids 
are upper bounded by n '" -* and thus we can perform all the field operations in time ©(n^ log n). 
Thus, the running time of this algorithm is upper bounded by 

,4i,n — 4i\ = O I max mn log n . 

J J V^^M W J 

This completes the proof. D 

Lemma 5.5. Minimum SCSS can be solved in time $O(2^{4\omega n} m^2 n)$.




Proof. Let us fix $n' = 4n - 4$. Proposition 5.2 implies that the Minimum SCSS problem is equivalent to finding, for an arbitrary vertex $s \in V(D)$, an out-branching $B_s^{+}$ and an in-branching $B_s^{-}$ that maximize $|A(B_s^{+}) \cap A(B_s^{-})|$. We guess the value of $|A(B_s^{+}) \cap A(B_s^{-})|$ and let this be $\ell$. By Lemma 5.3, there exists a strong spanning subdigraph $D'$ of $D$ with at most $2n - 2 - \ell$ arcs if and only if there exists a set $\widehat{F} \in \widehat{\mathcal{B}}^{4\ell} \subseteq^{n'-4\ell}_{rep} \mathcal{B}^{4\ell}$ such that $D$ has a strong spanning subdigraph $\bar{D}$ with $A_{\widehat{F}} \subseteq A(\bar{D})$. Recall that for $X \in \mathcal{I}$, by $A_X$ we denote the set of arcs $e \in A(D)$ such that $\{e_1, e_2, e_3, e_4\} \cap X \ne \emptyset$. Now using Lemma 5.4 we compute $\widehat{\mathcal{B}}^{4\ell} \subseteq^{n'-4\ell}_{rep} \mathcal{B}^{4\ell}$ within the time bound stated there.

For every $\widehat{F} \in \widehat{\mathcal{B}}^{4\ell}$ we test whether $A_{\widehat{F}}$ can be extended to an out-branching in $D_s^{+}$ and to an in-branching in $D_s^{-}$. We can do it in $O(n(n + m))$ time by putting weights 0 on the arcs of $A_{\widehat{F}}$ and weights 1 on all remaining arcs, and then running the classical algorithm of Edmonds [16]. Since $\ell \le n - 2$, the running time of this algorithm is upper bounded by $O(2^{4\omega n} m^2 n)$. This concludes the proof. □

Finally, we are ready to prove the main result of this section.

Theorem 7. Minimum Equivalent Graph can be solved in time $O(2^{4\omega n} m^2 n)$.

Proof. Given an arbitrary digraph $D$ we first find its strongly connected components $C_1, \ldots, C_r$. Now on each $C_i$, we apply Lemma 5.5 and obtain a minimum equivalent subdigraph $C'_i$. After this we apply Proposition 5.1 and obtain a minimum equivalent subdigraph of $D$. Since all the steps except Lemma 5.5 take polynomial time, we get the desired running time. This completes the proof. □

A weighted variant of Minimum Equivalent Graph has also been studied in the literature. More precisely, the problem is defined as follows.

Minimum Weight Equivalent Graph (MWEG)

Input: A directed graph $D$ and a weight function $w : A(D) \to \mathbb{N}$.

Task: Find a minimum weight equivalent subdigraph of $D$.

MWEG can be solved along the same lines as MEG, but to do this we need to use the notion of a min $q$-representative family and use Theorem 3 instead of Theorem 1. These changes give us the following theorem.

Theorem 8. Minimum Weight Equivalent Graph can be solved in time $O(2^{4\omega n} m^2 n)$.

5.3 Dynamic Programming over graphs of bounded treewidth 

In this section we discuss deterministic algorithms for "connectivity problems" such as Hamiltonian Path, Steiner Tree, and Feedback Vertex Set parameterized by the treewidth of the input graph. The algorithms are based on Theorem 1 and use graphic matroids to take care of connectivity constraints. The approach is generic and can be used whenever all the relevant information about a "partial solution" can be encoded as an independent set of a specific linear matroid. We exemplify the approach on the Steiner Tree problem.



Steiner Tree 

Input: An undirected graph $G$ with a set of terminals $T \subseteq V(G)$, and a weight function $w : E(G) \to \mathbb{N}$.
Task: Find a subtree in $G$ of minimum weight spanning all vertices of $T$.



5.3.1 Treewidth 

Let $G$ be a graph. A tree decomposition of a graph $G$ is a pair $(T, \mathcal{X} = \{X_t\}_{t \in V(T)})$ such that

• for every edge $xy \in E(G)$ there is a $t \in V(T)$ such that $\{x, y\} \subseteq X_t$, and

• for every vertex $v \in V(G)$ the subgraph of $T$ induced by the set $\{t : v \in X_t\}$ is connected.

The width of a tree decomposition is $\max_{t \in V(T)} |X_t| - 1$ and the treewidth of $G$ is the minimum width over all tree decompositions of $G$ and is denoted by $\mathrm{tw}(G)$.
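The two conditions and the width can be checked mechanically. The Python sketch below does so for a decomposition given as a dictionary of bags and a list of tree edges; the function name and encoding are assumptions made for illustration. It additionally requires that every vertex occurs in at least one bag, which is the usual reading of the second condition, and it does not verify that $T$ is a tree.

```python
def decomposition_width(vertices, graph_edges, tree_edges, bags):
    """Return the width of the decomposition, or None if a condition fails.

    bags maps each node of T to the set of graph vertices in its bag.
    """
    # Condition 1: every edge of G is contained in some bag.
    for x, y in graph_edges:
        if not any(x in bag and y in bag for bag in bags.values()):
            return None
    tree_adj = {t: set() for t in bags}
    for a, b in tree_edges:
        tree_adj[a].add(b)
        tree_adj[b].add(a)
    # Condition 2: the nodes whose bags contain v induce a connected subgraph of T.
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        if not nodes:
            return None
        start = next(iter(nodes))
        seen = {start}
        stack = [start]
        while stack:
            t = stack.pop()
            for nb in tree_adj[t]:
                if nb in nodes and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if seen != nodes:
            return None
    return max(len(bag) for bag in bags.values()) - 1
```

For a path on four vertices with bags $\{1,2\}$, $\{2,3\}$, $\{3,4\}$ arranged along a path, the returned width is 1.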

A tree decomposition $(T, \mathcal{X})$ is called a nice tree decomposition if $T$ is a tree rooted at some node $r$ where $X_r = \emptyset$, each node of $T$ has at most two children, and each node is of one of the following kinds:

1. Introduce node: a node $t$ that has only one child $t'$ where $X_t \supset X_{t'}$ and $|X_t| = |X_{t'}| + 1$.

2. Forget node: a node $t$ that has only one child $t'$ where $X_t \subset X_{t'}$ and $|X_t| = |X_{t'}| - 1$.

3. Join node: a node $t$ with two children $t_1$ and $t_2$ such that $X_t = X_{t_1} = X_{t_2}$.

4. Base node: a node $t$ that is a leaf of $T$, is different from the root, and $X_t = \emptyset$.

Notice that, according to the above definition, the root $r$ of $T$ is either a forget node or a join node. It is well known that any tree decomposition of $G$ can be transformed into a nice tree decomposition maintaining the same width in linear time [24]. We use $G_t$ to denote the graph induced by the vertex set $\bigcup_{t'} X_{t'}$, where $t'$ ranges over all descendants of $t$, including $t$. By $E(X_t)$ we denote the edges present in $G[X_t]$. We use $H_t$ to denote the graph on vertex set $V(G_t)$ and the edge set $E(G_t) \setminus E(X_t)$. For clarity of presentation we use the term nodes to refer to the vertices of the tree $T$.

5.3.2 Steiner Tree parameterized by treewidth

Let $G$ be an input graph of the Steiner Tree problem. Throughout this section, we say that $E' \subseteq E(G)$ is a solution if the subgraph induced on this edge set is connected and it contains all the terminal vertices. We call $E' \subseteq E(G)$ an optimal solution if $E'$ is a solution of the minimum weight. Let $\mathcal{S}$ be a family of edge subsets such that every edge subset corresponds to an optimal solution. That is,

\[
\mathcal{S} = \{E' \subseteq E(G) : E' \text{ is an optimal solution}\}.
\]

We start with a few definitions that will be useful in explaining the algorithm. Let $(T, \mathcal{X})$ be a tree decomposition of $G$ of width $\mathrm{tw}$. Let $t$ be a node of $V(T)$. By $\mathcal{S}_t$ we denote the family of edge subsets of $E(H_t)$, $\{E' \subseteq E(H_t)\}$, that satisfy the following properties.

• Either $E'$ is a solution (that is, the subgraph induced on this edge set is connected and it contains all the terminal vertices); or

• every vertex of $(T \cap V(G_t)) \setminus X_t$ is incident with some edge from $E'$, and every connected component of the graph induced by $E'$ contains a vertex from $X_t$.

We call $\mathcal{S}_t$ a family of partial solutions for $t$. We denote by $K^t$ a complete graph on the vertex set $X_t$. For an edge subset $E^* \subseteq E(G)$ and bag $X_t$ corresponding to a node $t$, we define the following.

1. Set $\partial^t(E^*) = X_t \cap V(E^*)$, the set of endpoints of $E^*$ in $X_t$.

2. Let $G^*$ be the subgraph of $G$ on the vertex set $V(G)$ and the edge set $E^*$. Let $C'_1, \ldots, C'_\ell$ be the connected components of $G^*$ such that for all $i \in [\ell]$, $C'_i \cap X_t \neq \emptyset$. Let $C_i = C'_i \cap X_t$. Observe that $C_1, \ldots, C_\ell$ is a partition of $\partial^t(E^*)$. By $F(E^*)$ we denote a forest $\{Q_1, \ldots, Q_\ell\}$ where each $Q_i$ is an arbitrary spanning tree of $K^t[C_i]$. For example, since $K^t[C_i]$ is a complete graph we could take $Q_i$ to be a star. The purpose of $F(E^*)$ is to keep track, for the vertices in the sets $C_i$, of whether they were in the same connected component of $G^*$.

3. We define $w(F(E^*)) = w(E^*)$.
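The three definitions above can be illustrated with a small sketch. It is not taken from the paper: the function name `boundary_forest`, the edge-list input format, and the choice of a star as the spanning tree of each block are assumptions made for illustration only. It computes $\partial^t(E^*)$ and one star per block $C_i$ using a union-find structure.

```python
def boundary_forest(edges, bag):
    """Return (boundary, forest) for an edge set E* and a bag X_t.

    boundary is X_t intersected with V(E*); forest is a list of star edges,
    one star per block of bag vertices lying in the same component of E*.
    Hypothetical helper, written only to illustrate the definitions.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    touched = {v for e in edges for v in e}
    boundary = touched & set(bag)

    # Group boundary vertices by the component of E* that contains them.
    blocks = {}
    for v in sorted(boundary):
        blocks.setdefault(find(v), []).append(v)

    # Each block becomes a star centred at its smallest vertex.
    forest = []
    for block in blocks.values():
        centre = block[0]
        forest.extend((centre, w) for w in block[1:])
    return boundary, sorted(forest)
```

For instance, with `edges = [(1, 2), (2, 3), (4, 5)]` and `bag = [1, 3, 4, 6]`, vertices 1 and 3 fall in one block and 4 in another, so the forest consists of the single star edge `(1, 3)`.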

Our description of the algorithm slightly deviates from the usual table look-up based expositions of dynamic programming algorithms on graphs of bounded treewidth. With every node $t$ of $T$, we associate a subgraph of $G$. In our case it will be $H_t$. For every node $t$, rather than keeping a table, we keep a family of partial solutions for the graph $H_t$. That is, for every optimal solution $L \in \mathscr{S}$ and its intersection $L_t = E(H_t) \cap L$ with the graph $H_t$, we have some partial solution in the family that is "as good as $L_t$". More precisely, we have some partial solution, say $\hat{L}_t$, in our family such that $\hat{L}_t \cup L_R$ is also an optimum solution for the whole graph. Here, $L_R = L \setminus L_t$. As we move from one node $t$ in the decomposition tree to the next node $t'$, the graph $H_t$ changes to $H_{t'}$, and so does the set of partial solutions. The algorithm updates its set of partial solutions accordingly. Here matroids come into play: in order to bound the size of the family of partial solutions that the algorithm stores at each node we employ Theorem 3 for graphic matroids. More details are given in the proof of the following theorem, which is the main result of this section.

Theorem 9. Let $G$ be an $n$-vertex graph given together with its tree decomposition of width tw. Then Steiner Tree on $G$ can be solved in time $O((1 + 2^{\omega+1})^{\mathrm{tw}} \, \mathrm{tw}^{O(1)} n)$.

Proof. We first outline an algorithm with running time $O((1 + 2^{\omega+1})^{\mathrm{tw}} \, \mathrm{tw}^{O(1)} n^2)$ for a simple exposition. Later we point out how we can remove the extra factor of $n$ at the cost of a factor polynomial in tw.

For every node $t$ of $T$ and subset $Z \subseteq X_t$, we store a family of edge subsets $\mathcal{S}_t[Z]$ of $H_t$ satisfying the following correctness invariant.

Correctness Invariant: For every $L \in \mathscr{S}$ we have the following. Let $L_t = E(H_t) \cap L$, $L_R = L \setminus L_t$, and $Z = \partial^t(L)$. Then there exists $\hat{L}_t \in \mathcal{S}_t[Z]$ such that $w(\hat{L}_t) \le w(L_t)$, $\hat{L} = \hat{L}_t \cup L_R$ is a solution, and $\partial^t(\hat{L}) = Z$. Observe that since $w(\hat{L}_t) \le w(L_t)$ and $L \in \mathscr{S}$, we have that $\hat{L} \in \mathscr{S}$.

We process the nodes of the tree $T$ from base nodes to the root node while doing the dynamic programming. Throughout the process we maintain the correctness invariant, which will prove the correctness of the algorithm. However, our main idea is to use representative sets to obtain $\mathcal{S}_t[Z]$ of small size. That is, given the set $\mathcal{S}_t[Z]$ that satisfies the correctness invariant, we use Theorem 3 to obtain a subset $\mathcal{S}'_t[Z]$ of $\mathcal{S}_t[Z]$ that also satisfies the correctness invariant and has size upper bounded by $2^{|Z|}$. Thus, we maintain the following size invariant.

Size Invariant: After node $t$ of $T$ is processed by the algorithm, for every $Z \subseteq X_t$ we have that $|\mathcal{S}_t[Z]| \le 2^{|Z|}$.

The new ingredient of the dynamic programming algorithm for Steiner Tree is the use of Theorem 3 to compute $\mathcal{S}_t[Z]$ maintaining the size invariant. The next lemma shows how to implement it.

Lemma 5.6 (Shrinking Lemma). Let $t$ be a node of $T$, and let $Z \subseteq X_t$ be a set of size $k$. Furthermore, let $\mathcal{S}_t[Z]$ be a family of edge subsets of $H_t$ satisfying the correctness invariant. If $|\mathcal{S}_t[Z]| = \ell$, then in time $O(2^{k(\omega-1)} k^{O(1)} \ell \cdot n)$ we can compute $\mathcal{S}'_t[Z] \subseteq \mathcal{S}_t[Z]$ satisfying the correctness and size invariants.

Proof. We start by associating a matroid with node $t$ and the set $Z \subseteq X_t$ as follows. We consider a graphic matroid $M = (E, \mathcal{I})$ on $K^t[Z]$. Here, the element set $E$ of the matroid is the edge set $E(K^t[Z])$ and the family of independent sets $\mathcal{I}$ consists of spanning forests of $K^t[Z]$. Let $\mathcal{S}_t[Z] = \{E^t_1, \ldots, E^t_\ell\}$ and let $\mathcal{N} = \{F(E^t_1), \ldots, F(E^t_\ell)\}$ be the set of forests in $K^t[Z]$ corresponding to the edge subsets in $\mathcal{S}_t[Z]$. For $i \in \{1, \ldots, k-1\}$, let $\mathcal{N}_i$ be the family of forests of $\mathcal{N}$ with $i$ edges. For each family $\mathcal{N}_i$ we apply Theorem 3 and compute its min $(k-1-i)$-representative. That is,

$\hat{\mathcal{N}}_i \subseteq^{k-1-i}_{minrep} \mathcal{N}_i.$

Let $\mathcal{S}'_t[Z] \subseteq \mathcal{S}_t[Z]$ be such that for every $E^t_j \in \mathcal{S}'_t[Z]$ we have that $F(E^t_j) \in \bigcup_{i=1}^{k-1} \hat{\mathcal{N}}_i$. By Theorem 3, $|\mathcal{S}'_t[Z]| \le \sum_{i=1}^{k-1} \binom{k}{i} \le 2^k$. Now we show that $\mathcal{S}'_t[Z]$ maintains the correctness invariant.

Let $L \in \mathscr{S}$ and let $L_t = E(H_t) \cap L$, $L_R = L \setminus L_t$ and $Z = \partial^t(L)$. Then there exists $E^t_j \in \mathcal{S}_t[Z]$ such that $w(E^t_j) \le w(L_t)$, $\hat{L} = E^t_j \cup L_R$ is an optimal solution and $\partial^t(\hat{L}) = Z$. Consider the forest $F(E^t_j)$. Suppose its size is $i$; then $F(E^t_j) \in \mathcal{N}_i$. Now let $F(L_R)$ be the forest corresponding to $L_R$ with respect to the bag $X_t$. Since $\hat{L}$ is a solution, we have that $F(E^t_j) \cup F(L_R)$ is a spanning tree in $K^t[Z]$. Since $\hat{\mathcal{N}}_i \subseteq_{minrep} \mathcal{N}_i$, there exists a forest $F(E^t_h) \in \hat{\mathcal{N}}_i$ such that $w(F(E^t_h)) \le w(F(E^t_j))$ and $F(E^t_h) \cup F(L_R)$ is a spanning tree in $K^t[Z]$. Thus, we know that $E^t_h \cup L_R$ is an optimum solution and $E^t_h \in \mathcal{S}'_t[Z]$. This proves that $\mathcal{S}'_t[Z]$ maintains the invariant.

The running time to compute $\mathcal{S}'_t[Z]$ is dominated by the applications of Theorem 3 to the families $\mathcal{N}_i$. For a given edge set we also need to compute the forest, and that can take $O(n)$ time. □

In our algorithm the size of $\mathcal{S}_t[Z]$ can grow larger than $2^{|Z|}$ in intermediate steps but it will be at most $4^{|Z|}$ and thus we can use the Shrinking Lemma (Lemma 5.6) to reduce its size efficiently. We now return to the dynamic programming algorithm over the tree-decomposition $(T, \mathcal{X})$ of $G$ and prove that it maintains the correctness invariant. We assume that $(T, \mathcal{X})$ is a nice tree-decomposition of $G$. By $\mathcal{S}_t$ we denote $\bigcup_{Z \subseteq X_t} \mathcal{S}_t[Z]$ (also called a representative family of partial solutions). We show how $\mathcal{S}_t$ is obtained by doing dynamic programming from the base nodes to the root node.

Base node t. Here the graph $H_t$ is empty and thus we take $\mathcal{S}_t = \emptyset$.

Introduce node t with child t'. Here, we know that $X_t \supset X_{t'}$ and $|X_t| = |X_{t'}| + 1$. Let $v$ be the vertex in $X_t \setminus X_{t'}$. Furthermore observe that $E(H_t) = E(H_{t'})$ and $v$ is a vertex of degree zero in $H_t$. Thus the graph $H_t$ differs from $H_{t'}$ only by the isolated vertex $v$. Since we have not added any edge to the new graph, the family of solutions, which contains edge subsets, does not change. Thus, we take $\mathcal{S}_t = \mathcal{S}_{t'}$. Formally, we take $\mathcal{S}_t[Z] = \mathcal{S}_{t'}[Z \setminus \{v\}]$. Since $H_t$ and $H_{t'}$ have the same set of edges, the invariant is vacuously maintained.

Forget node t with child t'. Here we know $X_t \subset X_{t'}$ and $|X_t| = |X_{t'}| - 1$. Let $v$ be the vertex in $X_{t'} \setminus X_t$. Let $\mathcal{E}_v[Z]$ denote the set of edges between $v$ and the vertices in $Z \subseteq X_t$. Observe that $E(H_t) = E(H_{t'}) \cup \mathcal{E}_v[X_t]$. Before we define things formally, observe that in this step the graphs $H_t$ and $H_{t'}$ differ by at most tw edges, namely the edges with one endpoint in $v$ and the other in $X_t$. We go through every possible way an optimal solution can intersect with these newly added edges. The idea is that for every edge subset in our family of partial solutions we make several new partial solutions, one for every subset of newly added edges. More formally the new set of partial solutions is defined as follows. Let

$\mathcal{S}_t[Z] = \mathcal{S}_{t'}[Z] \;\cup \bigcup_{X \subseteq \mathcal{E}_v[Z]} \mathcal{S}_{t'}[Z \cup \{v\}] \circ \{X\}.$

Recall that for two families $\mathcal{A}$ and $\mathcal{B}$, we defined $\mathcal{A} \circ \mathcal{B} = \{A \cup B : A \in \mathcal{A}, B \in \mathcal{B}\}$.
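The forget-node update can be sketched in a few lines. This is an illustrative reading of the formula above rather than the authors' implementation: the names `circ`, `subsets` and `forget_update`, and the dictionary-of-families layout of the child table, are assumptions. The sketch only builds the candidate family; shrinking it via Lemma 5.6 is not shown.

```python
from itertools import combinations


def circ(fam_a, fam_b):
    """A o B = {A | B : A in fam_a, B in fam_b} on families of frozensets."""
    return {a | b for a in fam_a for b in fam_b}


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def forget_update(child_table, Z, v, new_edges):
    """Candidate family S_t[Z] at a forget node (hypothetical layout).

    child_table maps frozenset(Z') to a set of frozenset edge sets,
    v is the forgotten vertex, new_edges plays the role of E_v[Z].
    """
    Z = frozenset(Z)
    result = set(child_table.get(Z, set()))
    with_v = child_table.get(Z | {v}, set())
    for X in subsets(new_edges):
        result |= circ(with_v, {X})
    return result
```

With $|Z| = k$ there are at most $2^k$ subsets $X$, which is where the $O(4^k)$ intermediate size discussed later comes from.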
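Now we show that $\mathcal{S}_t$ maintains the invariant of the algorithm. Let $L \in \mathscr{S}$.

1. Let $L_t = E(H_t) \cap L$ and $L_R = L \setminus L_t$. Furthermore, edges of $L_t$ can be partitioned into $L_{t'} = E(H_{t'}) \cap L$ and $L_v = L_t \setminus L_{t'}$. That is, $L_t = L_{t'} \uplus L_v$.

2. Let $Z = \partial^t(L)$ and $Z' = \partial^{t'}(L)$.

By the property of $\mathcal{S}_{t'}$, there exists $\hat{L}_{t'} \in \mathcal{S}_{t'}[Z']$ such that

$L \in \mathscr{S} \iff L_{t'} \uplus L_v \uplus L_R \in \mathscr{S}$

$\iff \hat{L}_{t'} \uplus L_v \uplus L_R \in \mathscr{S}$ (5)

and $\partial^{t'}(L) = \partial^{t'}(\hat{L}_{t'} \uplus L_v \uplus L_R) = Z'$.

We put $\hat{L}_t = \hat{L}_{t'} \cup L_v$ and $\hat{L} = \hat{L}_t \cup L_R$. We now show that $\hat{L}_t \in \mathcal{S}_t[Z]$. Towards this just note that since $Z' = Z$ or $Z' = Z \cup \{v\}$, we have that $\mathcal{S}_t[Z]$ contains $\mathcal{S}_{t'}[Z'] \circ \{L_v\}$. By (5), $\hat{L} \in \mathscr{S}$. Finally, we need to show that $\partial^t(\hat{L}) = Z$. Towards this just note that $\partial^t(\hat{L}) = Z' \setminus \{v\} = Z$. This concludes the proof of the fact that $\mathcal{S}_t$ maintains the correctness invariant.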
Join node t with two children $t_1$ and $t_2$. Here, we know that $X_t = X_{t_1} = X_{t_2}$. Also we know that the edge set of $H_t$ is the union of the edge sets of $H_{t_1}$ and $H_{t_2}$, which are disjoint. Of course they are separated by the vertices in $X_t$. A natural way to obtain a family of partial solutions for $H_t$ is to take the unions of edge subsets of the families stored at nodes $t_1$ and $t_2$. This is exactly what we do. Let

$\mathcal{S}_t[Z] = \mathcal{S}_{t_1}[Z] \circ \mathcal{S}_{t_2}[Z].$

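Now we show that $\mathcal{S}_t$ maintains the invariant. Let $L \in \mathscr{S}$.

1. Let $L_t = E(H_t) \cap L$ and $L_R = L \setminus L_t$. Furthermore, edges of $L_t$ can be partitioned into those belonging to $H_{t_1}$ and those belonging to $H_{t_2}$. Let $L_{t_1} = E(H_{t_1}) \cap L$ and $L_{t_2} = E(H_{t_2}) \cap L$. Observe that since $E(H_{t_1}) \cap E(H_{t_2}) = \emptyset$, we have that $L_{t_1} \cap L_{t_2} = \emptyset$. Also observe that $L_t = L_{t_1} \uplus L_{t_2}$.

2. Let $Z = \partial^t(L)$. Since $X_t = X_{t_1} = X_{t_2}$ this implies that $Z = \partial^t(L) = \partial^{t_1}(L) = \partial^{t_2}(L)$.

Now observe that

$L \in \mathscr{S} \iff L_{t_1} \uplus L_{t_2} \uplus L_R \in \mathscr{S}$

$\iff \hat{L}_{t_1} \uplus L_{t_2} \uplus L_R \in \mathscr{S}$ (by the property of $\mathcal{S}_{t_1}$ we have that $\hat{L}_{t_1} \in \mathcal{S}_{t_1}[Z]$)

$\iff \hat{L}_{t_1} \uplus \hat{L}_{t_2} \uplus L_R \in \mathscr{S}$ (by the property of $\mathcal{S}_{t_2}$ we have that $\hat{L}_{t_2} \in \mathcal{S}_{t_2}[Z]$)

We put $\hat{L}_t = \hat{L}_{t_1} \cup \hat{L}_{t_2}$. By the definition of $\mathcal{S}_t[Z]$, we have that $\hat{L}_{t_1} \cup \hat{L}_{t_2} \in \mathcal{S}_t[Z]$. The above equivalences also show that $\hat{L} = \hat{L}_t \cup L_R \in \mathscr{S}$. It remains to show that $\partial^t(\hat{L}) = Z$. Since $\partial^{t_1}(L) = Z$, we have that $\partial^{t_1}(\hat{L}_{t_1} \uplus L_{t_2} \uplus L_R) = Z$. Now since $X_{t_1} = X_{t_2}$ we have that $\partial^{t_2}(\hat{L}_{t_1} \uplus L_{t_2} \uplus L_R) = Z$ and thus $\partial^{t_2}(\hat{L}_{t_1} \uplus \hat{L}_{t_2} \uplus L_R) = Z$. Finally, because $X_{t_2} = X_t$, we conclude that $\partial^t(\hat{L}_{t_1} \uplus \hat{L}_{t_2} \uplus L_R) = \partial^t(\hat{L}) = Z$. This concludes the proof of the correctness invariant.

Root node r. Here, $X_r = \emptyset$. We go through all the solutions in $\mathcal{S}_r[\emptyset]$ and output the one with the minimum weight. This concludes the description of the dynamic programming algorithm.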

Computation of $\mathcal{S}_t$. Now we show how to implement the algorithm described above in the desired running time by making use of Lemma 5.6. For our discussion let us fix a node $t$ and $Z \subseteq X_t$ of size $k$. While doing the dynamic programming from the base nodes to the root node we always maintain the size invariant. That is, $|\mathcal{S}_t[Z]| \le 2^k$.

Base node t. Trivially, in this case we have $|\mathcal{S}_t[Z]| \le 2^k$.

Introduce node t with child t'. Here, we have that $\mathcal{S}_t[Z] = \mathcal{S}_{t'}[Z \setminus \{v\}]$ and thus $|\mathcal{S}_t[Z]| = |\mathcal{S}_{t'}[Z \setminus \{v\}]| \le 2^{k-1} \le 2^k$.

Forget node t with child t'. In this case,

$\mathcal{S}_t[Z] = \mathcal{S}_{t'}[Z] \;\cup \bigcup_{X \subseteq \mathcal{E}_v[Z]} \mathcal{S}_{t'}[Z \cup \{v\}] \circ \{X\}.$

Observe that

$|\mathcal{S}_t[Z]| \le |\mathcal{S}_{t'}[Z]| + \sum_{X \subseteq \mathcal{E}_v[Z]} |\mathcal{S}_{t'}[Z \cup \{v\}] \circ \{X\}| \le 2^k + \sum_{i=0}^{k} \binom{k}{i} 2^{k+1} = O(4^k).$

It can happen in this case that the size of $\mathcal{S}_t[Z]$ is larger than $2^k$ and thus we need to reduce the size of the family. We apply Lemma 5.6 and obtain $\mathcal{S}'_t[Z]$ that maintains the correctness and size invariants. We update $\mathcal{S}_t[Z] = \mathcal{S}'_t[Z]$.

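The running time to compute $\mathcal{S}_t$ (that is, across all subsets of $X_t$) is

$O\left(\sum_{i=1}^{\mathrm{tw}+1} \binom{\mathrm{tw}+1}{i} 2^{i(\omega-1)} 4^i \cdot \mathrm{tw}^{O(1)} n\right) = O\left((1 + 2^{\omega+1})^{\mathrm{tw}} \cdot \mathrm{tw}^{O(1)} n\right).$

Join node t with two children $t_1$ and $t_2$. Here we defined

$\mathcal{S}_t[Z] = \mathcal{S}_{t_1}[Z] \circ \mathcal{S}_{t_2}[Z].$

The size of $\mathcal{S}_t[Z]$ is at most $2^k \cdot 2^k = 4^k$. Now, we apply Lemma 5.6 and obtain $\mathcal{S}'_t[Z]$ that maintains the correctness invariant and has size at most $2^k$. We put $\mathcal{S}_t[Z] = \mathcal{S}'_t[Z]$.
The running time to compute $\mathcal{S}_t$ is

$O\left(\sum_{i=1}^{\mathrm{tw}+1} \binom{\mathrm{tw}+1}{i} 2^{i(\omega-1)} 4^i \cdot \mathrm{tw}^{O(1)} n\right) = O\left((1 + 2^{\omega+1})^{\mathrm{tw}} \cdot \mathrm{tw}^{O(1)} n\right).$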
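The closed form follows from the binomial theorem. The short derivation below is a sketch under the assumption that Lemma 5.6 costs $2^{i(\omega-1)} i^{O(1)} \ell \cdot n$ on a family of size $\ell \le 4^i$ indexed by a set $Z$ of size $i$; the polynomial factors are collected into $\mathrm{tw}^{O(1)}$.

```latex
\begin{align*}
\sum_{i=1}^{\mathrm{tw}+1} \binom{\mathrm{tw}+1}{i}\, 2^{i(\omega-1)}\, 4^{i}
  &= \sum_{i=1}^{\mathrm{tw}+1} \binom{\mathrm{tw}+1}{i}\, \bigl(2^{\omega-1}\cdot 2^{2}\bigr)^{i} \\
  &\le \sum_{i=0}^{\mathrm{tw}+1} \binom{\mathrm{tw}+1}{i}\, \bigl(2^{\omega+1}\bigr)^{i}
   = \bigl(1 + 2^{\omega+1}\bigr)^{\mathrm{tw}+1} \\
  &= O\!\left(\bigl(1 + 2^{\omega+1}\bigr)^{\mathrm{tw}}\right),
\end{align*}
```

where the last step uses that $1 + 2^{\omega+1}$ is a constant.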

Thus the whole algorithm takes $O((1 + 2^{\omega+1})^{\mathrm{tw}} \cdot \mathrm{tw}^{O(1)} \cdot n^2)$ time, as the number of nodes in a nice tree-decomposition is upper bounded by $O(n)$. However, observe that we do not need to compute the forests and the associated weight at every step of the algorithm. The size of the forest is at most tw + 1 and we can maintain these forests across the bags during dynamic programming in time $\mathrm{tw}^{O(1)}$. This will lead to an algorithm with the claimed running time. The last remark we would like to make is that one can do better at a forget node by forgetting a single edge at a time. However, we did not try to optimize this, as computing the family of partial solutions at a join node is the most expensive operation. This completes the proof. □

The approach of Theorem 9 can be used to obtain single-exponential algorithms parameterized by the treewidth of an input graph for many other connectivity problems such as Hamiltonian Cycle, Feedback Vertex Set, and Connected Dominating Set. For all these problems, whether two partial solutions can be glued together to form a global solution can be checked by testing independence in a specific graphic matroid. We believe that there exist interesting problems where this check corresponds to testing independence in a different class of linear matroids.

5.4 Path, Trees and Subgraph Isomorphism

In this section we outline algorithms for k-Path, k-Tree and k-Subgraph Isomorphism using representative sets. All results in this section are based on computing representative families with respect to uniform matroids.

5.4.1 k-Path

The problem we study in this section is as follows.

k-Path Parameter: k
Input: An undirected n-vertex and m-edge graph G and a positive integer k.
Question: Does there exist a simple path of length k in G?
We start by modifying the graph slightly. We add a new vertex, say $s$ not present in $V(G)$, to $G$ by making it adjacent to every vertex in $V(G)$. Let the modified graph be called $G'$. It is clear that $G$ has a path of length $k$ if and only if $G'$ has a path of length $k+1$ starting from $s$. For ease of presentation we rename $G'$ to $G$ and the objective is to find a path of length $k+1$ starting from $s$. Let $M = (E, \mathcal{I})$ be a uniform matroid $U_{n,k+2}$ where $E = V(G)$ and $\mathcal{I} = \{S \subseteq V(G) \mid |S| \le k+2\}$. In this section whenever we speak about independent sets we mean independence with respect to the uniform matroid $U_{n,k+2}$ defined above. For a given pair of vertices $s, v \in V(G)$, recall that we defined

$\mathcal{P}^i_{sv} = \{ X : X \subseteq V(G),\ v, s \in X,\ |X| = i \text{ and there is a path from } s \text{ to } v \text{ of length } i \text{ in } G \text{ with all the vertices belonging to } X \}.$
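To make the definition concrete, the following sketch enumerates the families $\mathcal{P}^i_{sv}$ explicitly, level by level. It is only an illustration: the function name `path_families` and the adjacency-dictionary input are assumptions, "length $i$" is read as "on $i$ vertices" (consistent with $|X| = i$), and no representative-family shrinking is applied, so the families can be exponentially large.

```python
def path_families(adj, s, max_size):
    """families[i][v] = all vertex sets X with |X| = i that carry an
    s-to-v path using exactly the vertices of X (no shrinking applied)."""
    families = {1: {s: {frozenset([s])}}}
    for i in range(2, max_size + 1):
        level = {}
        for u, sets in families[i - 1].items():
            for X in sets:
                for v in adj[u]:
                    if v not in X:  # keep the path simple
                        level.setdefault(v, set()).add(X | {v})
        families[i] = level
    return families
```

The algorithm in the text follows the same level-by-level recurrence but replaces each family by a representative subfamily after every step, which is what keeps the sizes bounded.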




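Now using Lemma 5.2 we compute $\hat{\mathcal{P}}^{k+2}_{sv} \subseteq^{0}_{rep} \mathcal{P}^{k+2}_{sv}$ for all $v \in V(G) \setminus \{s\}$ in time

$O\left(2^{o(k)} m \log^2 n \max_{i \in [k+2]} \left\{ \binom{k+2}{i} \left(\frac{k+2}{k+2-i}\right)^{k+2-i} \right\}\right).$

The maximum is achieved at $i = \alpha k$ where $\alpha = 1 + \frac{1 - \sqrt{1+4e}}{2e}$. Thus, the running time of the algorithm is $O(2.8505^k \cdot 2^{o(k)} m \log^2 n) = O(2.851^k \cdot m \log^2 n)$.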
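The constant can be checked numerically. The sketch below assumes that the dominant term has the form $\binom{k}{i}\left(\frac{k}{k-i}\right)^{k-i}$ with $i = ak$, and uses the standard entropy estimate $\binom{k}{ak} \approx \left(a^{-a}(1-a)^{-(1-a)}\right)^k$, so that the base of the exponential is $a^{-a}(1-a)^{-2(1-a)}$. The names `rate`, `alpha` and `best_on_grid` are illustrative.

```python
import math


def rate(a):
    """Base of the exponential growth of binom(k, a*k) * (k/(k - a*k))**(k - a*k),
    using the entropy approximation of the binomial coefficient."""
    return a ** (-a) * (1 - a) ** (-2 * (1 - a))


# Closed-form maximiser: the root of e*(1 - a)**2 = a that lies in (0, 1).
alpha = 1 + (1 - math.sqrt(1 + 4 * math.e)) / (2 * math.e)

# Brute-force maximum over a fine grid, for comparison with rate(alpha).
best_on_grid = max(rate(j / 10000) for j in range(1, 10000))
```

Evaluating `rate(alpha)` gives a value of about 2.850, in line with the base 2.8505 quoted above, and the grid maximum does not exceed it.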

Furthermore, in the same time every set in $\hat{\mathcal{P}}^{k+2}_{sv}$ can be ordered in a way that it corresponds to an undirected path in $G$. A graph $G$ has a path of length $k+1$ starting from $s$ if and only if for some $v \in V(G) \setminus \{s\}$, we have that $\hat{\mathcal{P}}^{k+2}_{sv} \neq \emptyset$. Thus the running time of this algorithm is upper bounded by $O(2.851^k \cdot m \log^2 n)$. Let us remark that almost the same arguments show that the version of the problem on directed graphs is solvable within the same running time. However, on undirected graphs we can speed up the algorithm slightly by using the following standard trick. We need the following result.

Proposition 5.3 ([6]). There exists an algorithm that, given a graph $G$ and an integer $k$, in time $O(k^2 n)$ either finds a simple path of length $\ge k$ or computes a DFS (depth first search) tree rooted at some vertex of $G$ of depth at most $k$.

We first apply Proposition 5.3 and in time $O(k^2 n)$ either find a simple path of length $\ge k$ in $G$ or compute a DFS tree of $G$ of depth at most $k$. In the former case we simply output the same path. In the latter case, since all the root to leaf paths have length at most $k$ and there are no cross edges in a DFS tree, the number of edges in $G$ is upper bounded by $O(k^2 n)$. Now on this $G$ we apply the representative set based algorithm described above. This results in the following theorem.

Theorem 10. k-Path can be solved in time $O(2.851^k \cdot n \log^2 n)$.

5.4.2 k-Tree and k-Subgraph Isomorphism

In this section we consider the following problem.

k-Tree Parameter: k
Input: An undirected n-vertex, m-edge graph G and a tree T on k vertices.
Question: Does G contain a subgraph isomorphic to T?









We design an algorithm for k-Tree using the method of representative sets. The algorithm for k-Tree is more involved than for k-Path. The reason is that paths possess perfectly balanced separators of size one while trees do not. We select a leaf $r$ of $T$ and root the tree at $r$. For vertices $x, y \in V(T)$ we say that $y \le x$ if $x$ lies on the path from $y$ to $r$ in $T$ (if $x = r$ we also say that $y \le x$). For a set $C$ of vertices in $T$ we will say that $x \le_C y$ if $x \le y$ and there is no $z \in C$ such that $x \le z$ and $z \le y$. For a pair $x, y$ of vertices such that $y \le x$ in $T$ we define

$C^{xy} = \begin{cases} \emptyset & \text{if } xy \in E(T), \\ \text{the unique component } C \text{ of } T \setminus \{x, y\} \text{ such that } N(C) = \{x, y\} & \text{otherwise.} \end{cases}$

We also define $T^{uv} = T[C^{uv} \cup \{u, v\}]$. We start by making a few simple observations about sets of vertices in trees.

Lemma 5.7. For any tree $T$, a pair $\{x, y\}$ of vertices in $V(T)$ and integer $c \ge 1$ there exists a set $W$ of vertices such that $\{x, y\} \subseteq W$, $|W| = O(c)$ and every connected component $U$ of $T \setminus W$ satisfies $|U| \le \frac{|V(T)|}{c}$ and $|N(U)| \le 2$.

Proof. We first find a set $W_1$ of size at most $c$ such that every connected component $U$ of $T \setminus W_1$ satisfies $|U| \le \frac{|V(T)|}{c}$. Start with $W_1 = \emptyset$ and select a lowermost vertex $u \in V(T)$ such that the subtree rooted at $u$ has at least $\frac{|V(T)|}{c}$ vertices. Add $u$ to $W_1$ and remove the subtree rooted at $u$ from $T$. The process must stop after $c$ iterations since each iteration removes at least $\frac{|V(T)|}{c}$ vertices of $T$. Each component $U$ of $T \setminus W_1$ satisfies $|U| \le \frac{|V(T)|}{c}$ because (a) whenever a vertex $u$ is added to $W_1$, all components below $u$ have size strictly less than $\frac{|V(T)|}{c}$ and (b) when the process ends the subtree rooted at $r$ has size at most $\frac{|V(T)|}{c}$. Now, insert $x$ and $y$ into $W_1$ as well.

We build $W$ from $W_1$ by taking the least common ancestor closure of $W_1$; start with $W = W_1$ and as long as there exist two vertices $u$ and $v$ in $W$ such that their least common ancestor $w$ is not in $W$, add $w$ to $W$. Standard counting arguments on trees imply that this process will never increase the size of $W$ by more than a factor 2, hence $|W| \le 2|W_1| = O(c)$.

We claim that every connected component $U$ of $T \setminus W$ satisfies $|N(U)| \le 2$. Suppose not and let $u$ be the vertex of $U$ closest to the root. Since $|N(U)| > 2$ at least two vertices $v$ and $w$ in $N(U)$ are descendants of $u$. Since $U$ is connected, $v$ and $w$ cannot be descendants of each other, but then the least common ancestor of $v$ and $w$ is in $U$, contradicting the construction of $W$. □
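The construction in the proof is algorithmic and can be sketched directly. The code below is a minimal illustration, assuming the tree is given as a dictionary `children` mapping each vertex to its list of children; the function name `balanced_separator` and this input format are not from the paper. It performs the greedy bottom-up selection of $W_1$, adds $x$ and $y$, and then takes the least common ancestor closure by brute force.

```python
from itertools import combinations


def balanced_separator(children, root, x, y, c):
    """Greedy construction following the proof of Lemma 5.7 (sketch)."""
    # Breadth-first order from the root; parents precede their children.
    parent = {root: None}
    order = [root]
    for u in order:  # the list grows while we iterate over it
        for w in children.get(u, []):
            parent[w] = u
            order.append(w)

    threshold = len(order) / c
    size = {}
    W = set()
    for u in reversed(order):  # children are processed before parents
        size[u] = 1 + sum(size[w] for w in children.get(u, []))
        if size[u] >= threshold:
            W.add(u)
            size[u] = 0  # the subtree below u is cut off
    W |= {x, y}

    def ancestors(u):
        chain = []
        while u is not None:
            chain.append(u)
            u = parent[u]
        return chain

    def lca(u, v):
        seen = set(ancestors(u))
        return next(a for a in ancestors(v) if a in seen)

    # Least common ancestor closure, repeated until nothing changes.
    changed = True
    while changed:
        changed = False
        for u, v in combinations(sorted(W), 2):
            a = lca(u, v)
            if a not in W:
                W.add(a)
                changed = True
    return W
```

The closure step is quadratic per round and is written for clarity rather than speed; vertices are assumed to be mutually comparable (for example integers) so that they can be sorted.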

Observation 5.1. For any tree $T$, set $W \subseteq V(T)$ and component $U$ of $T \setminus W$ such that $|N(U)| = 1$, $U$ contains a leaf of $T$.

Proof. $T[U \cup N(U)]$ is a tree on at least two vertices and hence it has at least two leaves. At most one of these leaves is in $N(U)$; the other one is also a leaf of $T$. □

Lemma 5.8. Let $W \subseteq V(T)$ be a set of vertices such that for every pair of vertices in $W$ their least common ancestor is also in $W$. Let $X$ be a set containing one leaf of $T$ from each connected component $U$ of $T \setminus W$ such that $|N(U)| = 1$. Then, for every connected component $U$ such that $|N(U)| = 1$ there exist $x \in W$, $y \in X$ such that $U = C^{xy} \cup \{y\}$. For every other connected component $U$ there exist $x, y \in W$ such that $U = C^{xy}$.

Proof. It follows from the argument at the end of the proof of Lemma 5.7 that every component $U$ of $T \setminus W$ satisfies $|N(U)| \le 2$. If $|N(U)| = 2$, let $N(U) = \{x, y\}$. We have that $x \le y$ or $y \le x$, since otherwise the least common ancestor of $x$ and $y$, which cannot be in $U$, would be a third vertex of $N(U)$, contradicting $|N(U)| = 2$. Without loss of generality $y \le x$. But then $U = C^{xy}$. If $|N(U)| = 1$, let $N(U) = \{x\}$. By Observation 5.1, $U$ contains a leaf $y$ of $T$. Then $U = C^{xy} \cup \{y\}$. □
Given two graphs $F$ and $H$, a graph homomorphism from $F$ to $H$ is a map $f$ from $V(F)$ to $V(H)$, that is $f : V(F) \to V(H)$, such that if $uv \in E(F)$, then $f(u)f(v) \in E(H)$. Furthermore, when the map $f$ is injective and $|V(H)| = |V(F)|$, $f$ is called a subgraph isomorphism. For every $x, y \in V(T)$ such that $y \le x$, and every $u, v$ in $V(G)$ we define

$\mathcal{F}^{xy}_{uv} = \Big\{ F \in \binom{V(G) \setminus \{u, v\}}{|C^{xy}|} : \exists \text{ subgraph isomorphism } f \text{ from } T^{xy} \text{ to } G[F \cup \{u, v\}] \text{ such that } f(x) = u \text{ and } f(y) = v \Big\}.$

For every $x, y \in V(T)$ such that $y \le x$, and every $u$ in $V(G)$ we define

$\mathcal{F}^{xy}_{u*} = \bigcup_{v \in V(G) \setminus \{u\}} \mathcal{F}^{xy}_{uv} \bullet \{\{v\}\}.$ (6)

In order to solve the problem it is sufficient to select an arbitrary leaf $\ell$ of $T$ and determine whether there exists a $u \in V(G)$ such that the family $\mathcal{F}^{r\ell}_{u*}$ is non-empty. We show that the collections of families $\{\mathcal{F}^{xy}_{uv}\}$ and $\{\mathcal{F}^{xy}_{u*}\}$ satisfy a recurrence relation. We will then exploit this recurrence relation to get a fast algorithm for k-Tree.

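Lemma 5.9. For every $x, y \in V(T)$ such that $y \le x$, every $\hat{W} = W \cup \{x, y\}$ where $W \subseteq C^{xy}$, such that for every pair of vertices in $\hat{W}$ their least common ancestor is also in $\hat{W}$, every $X \subseteq C^{xy} \setminus W$ such that $X$ contains exactly one leaf of $T$ in each connected component $U$ of $T^{xy} \setminus \hat{W}$ with $|N(U)| = 1$, the following recurrence holds.

$\mathcal{F}^{xy}_{uv} = \bigcup_{\substack{g : \hat{W} \to V(G) \\ g(x) = u \,\wedge\, g(y) = v}} \left[ \left( \prod^{\bullet}_{\substack{x', y' \in \hat{W} \\ y' \le_{\hat{W}} x'}} \mathcal{F}^{x'y'}_{g(x')g(y')} \right) \bullet \left( \prod^{\bullet}_{\substack{x' \in \hat{W},\ y' \in X \\ y' \le_{\hat{W}} x'}} \mathcal{F}^{x'y'}_{g(x')*} \right) \bullet \{g(W)\} \right]$ (7)

Here the union goes over all $O(n^{|W|})$ injective maps $g$ from $\hat{W}$ to $V(G)$ such that $g(x) = u$ and $g(y) = v$, and by $g(W)$ we mean $\{g(c) : c \in W\}$.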



Proof. For the $\subseteq$ direction of the equality consider any subgraph isomorphism $f$ from $T^{xy}$ to $G$ such that $f(x) = u$ and $f(y) = v$. Let $g$ be the restriction of $f$ to $\hat{W}$. The map $f$ can be considered as a collection of subgraph isomorphisms with one isomorphism for each $x', y' \in \hat{W}$ such that $y' \le_{\hat{W}} x'$ from $T^{x'y'}$ to $G$ such that $f(x') = g(x')$ and $f(y') = g(y')$, and one isomorphism for each $x' \in \hat{W}$, $y' \in X$ such that $y' \le_{\hat{W}} x'$ from $T^{x'y'}$ to $G$ such that $f(x') = g(x')$. Taking the union of the ranges of each of the small subgraph isomorphisms clearly gives the range of $f$. Here we used Lemma 5.8 to argue that for every connected component $U$ of $T^{xy} \setminus \hat{W}$ we have that $T[U \cup N(U)]$ is in fact of the form $T^{x'y'}$ for some $x', y'$.

For the reverse direction take any collection of subgraph isomorphisms with one isomorphism $f$ for each $x', y' \in \hat{W}$ such that $y' \le_{\hat{W}} x'$ from $T^{x'y'}$ to $G$ such that $f(x') = g(x')$ and $f(y') = g(y')$, and one isomorphism for each $x' \in \hat{W}$, $y' \in X$ such that $y' \le_{\hat{W}} x'$ from $T^{x'y'}$ to $G$ such that $f(x') = g(x')$, such that the ranges of all of these subgraph isomorphisms are pairwise disjoint (except on vertices in $\hat{W}$). Since all of these subgraph isomorphisms agree on the set $\hat{W}$ they can be glued together to a subgraph isomorphism from $T^{xy}$ to $G$. □



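Our goal is to compute for every $x, y \in V(T)$ such that $y \le x$ and $u, v \in V(G)$ a family $\hat{\mathcal{F}}^{xy}_{uv}$ such that $\hat{\mathcal{F}}^{xy}_{uv} \subseteq^{k-|C^{xy}|}_{rep} \mathcal{F}^{xy}_{uv}$, and for every $x, y \in V(T)$ such that $y \le x$ and $u \in V(G)$ a family $\hat{\mathcal{F}}^{xy}_{u*}$ such that $\hat{\mathcal{F}}^{xy}_{u*} \subseteq_{rep} \mathcal{F}^{xy}_{u*}$. We will also maintain the invariant that $|\hat{\mathcal{F}}^{xy}_{uv}| \le \binom{k}{|C^{xy}|} 2^{o(k)} \log n$ and $|\hat{\mathcal{F}}^{xy}_{u*}| \le \binom{k}{|C^{xy}|+1} 2^{o(k)} \log n$.

We first compute such families $\hat{\mathcal{F}}^{xy}_{uv}$ for all $x, y \in V(T)$ such that $y \le x$ and $xy \in E(T)$. Observe that in this case we have

$\mathcal{F}^{xy}_{uv} = \begin{cases} \{\emptyset\} & \text{if } uv \in E(G), \\ \emptyset & \text{if } uv \notin E(G). \end{cases}$

For each $x, y \in V(T)$ such that $y \le x$ and $xy \in E(T)$ and every $u, v \in V(G)$ we set $\hat{\mathcal{F}}^{xy}_{uv} = \mathcal{F}^{xy}_{uv}$. We can now compute $\hat{\mathcal{F}}^{xy}_{u*}$ for every $x, y \in V(T)$ such that $y \le x$ and $xy \in E(T)$ and every $u \in V(G)$ by applying Equation (6). Clearly the computed families are within the required size bounds.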

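We now show how to compute a family $\hat{\mathcal{F}}^{xy}_{uv}$ of size $\binom{k}{|C^{xy}|} 2^{o(k)} \log n$ for every $x, y \in V(T)$ such that $y \le x$ and $u, v \in V(G)$ and $|C^{xy}| = t$, assuming that the families $\hat{\mathcal{F}}^{x'y'}_{u'v'}$ and $\hat{\mathcal{F}}^{x'y'}_{u'*}$ have been computed for every $x', y' \in V(T)$ such that $y' \le x'$ and $u', v' \in V(G)$ and $|C^{x'y'}| < t$. We also assume that for each family $\hat{\mathcal{F}}^{x'y'}_{u'v'}$ that has been computed, $|\hat{\mathcal{F}}^{x'y'}_{u'v'}| \le \binom{k}{|C^{x'y'}|} 2^{o(k)} \log n$. Similarly we assume that for each family $\hat{\mathcal{F}}^{x'y'}_{u'*}$ that has been computed, $|\hat{\mathcal{F}}^{x'y'}_{u'*}| \le \binom{k}{|C^{x'y'}|+1} 2^{o(k)} \log n$.

We fix a constant $c$ whose value will be decided later. First apply Lemma 5.7 on $T^{xy}$, vertex pair $\{x, y\}$ and constant $c$ and obtain a set $\hat{W}$ such that $\{x, y\} \subseteq \hat{W}$ and every connected component $U$ of $T^{xy} \setminus \hat{W}$ satisfies $|U| \le \frac{|V(T^{xy})|}{c}$ and $|N(U)| \le 2$. Select a set $X \subseteq V(T^{xy}) \setminus \hat{W}$ such that each connected component $U$ of $T^{xy} \setminus \hat{W}$ with $|N(U)| = 1$ contains exactly one leaf which is in $X$. Now, set $W = \hat{W} \setminus \{x, y\}$ and consider Equation (7) for $\mathcal{F}^{xy}_{uv}$ for this choice of $x$, $y$, $W$ and $X$. Define

$\tilde{\mathcal{F}}^{xy}_{uv} = \bigcup_{\substack{g : \hat{W} \to V(G) \\ g(x) = u \,\wedge\, g(y) = v}} \left[ \left( \prod^{\bullet}_{\substack{x', y' \in \hat{W} \\ y' \le_{\hat{W}} x'}} \hat{\mathcal{F}}^{x'y'}_{g(x')g(y')} \right) \bullet \left( \prod^{\bullet}_{\substack{x' \in \hat{W},\ y' \in X \\ y' \le_{\hat{W}} x'}} \hat{\mathcal{F}}^{x'y'}_{g(x')*} \right) \bullet \{g(W)\} \right]$ (8)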



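Lemma 5.9 together with Lemmata 3.2 and 3.3 directly imply that $\tilde{\mathcal{F}}^{xy}_{uv} \subseteq_{rep} \mathcal{F}^{xy}_{uv}$. Furthermore, each family on the right hand side of Equation (8) has already been computed, since $C^{x'y'} \subset C^{xy}$ and so $|C^{x'y'}| < t$. For a fixed injective map $g : \hat{W} \to V(G)$ we define

$\tilde{\mathcal{F}}^{xy}_{g} = \left( \prod^{\bullet}_{\substack{x', y' \in \hat{W} \\ y' \le_{\hat{W}} x'}} \hat{\mathcal{F}}^{x'y'}_{g(x')g(y')} \right) \bullet \left( \prod^{\bullet}_{\substack{x' \in \hat{W},\ y' \in X \\ y' \le_{\hat{W}} x'}} \hat{\mathcal{F}}^{x'y'}_{g(x')*} \right) \bullet \{g(W)\}$ (9)

It follows directly from the definition of $\tilde{\mathcal{F}}^{xy}_{uv}$ and $\tilde{\mathcal{F}}^{xy}_{g}$ that

$\tilde{\mathcal{F}}^{xy}_{uv} = \bigcup_{\substack{g : \hat{W} \to V(G) \\ g(x) = u \,\wedge\, g(y) = v}} \tilde{\mathcal{F}}^{xy}_{g}.$

Our goal is to compute a family $\hat{\mathcal{F}}^{xy}_{uv} \subseteq_{rep} \tilde{\mathcal{F}}^{xy}_{uv}$ such that $|\hat{\mathcal{F}}^{xy}_{uv}| \le \binom{k}{|C^{xy}|} 2^{o(k)} \log n$. Lemma 3.1 then implies that $\hat{\mathcal{F}}^{xy}_{uv} \subseteq_{rep} \mathcal{F}^{xy}_{uv}$. To that end, we define the function reduce.

Given a family $\mathcal{F}$ of sets of size $p$, the function reduce will run the algorithm of Theorem 2 on $\mathcal{F}$ and produce a family of size $\binom{k}{p} \cdot 2^{o(k)} \cdot \log n$ that $(k-p)$-represents $\mathcal{F}$.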
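We will compute for each $g : \hat{W} \to V(G)$ such that $g(x) = u$ and $g(y) = v$ a family $\hat{\mathcal{F}}^{xy}_{g}$ of size at most $\binom{k}{|C^{xy}|} 2^{o(k)} \log n$ such that $\hat{\mathcal{F}}^{xy}_{g} \subseteq^{k-|C^{xy}|}_{rep} \tilde{\mathcal{F}}^{xy}_{g}$. We will then set

$\hat{\mathcal{F}}^{xy}_{uv} = \mathrm{reduce}\left( \bigcup_{\substack{g : \hat{W} \to V(G) \\ g(x) = u \,\wedge\, g(y) = v}} \hat{\mathcal{F}}^{xy}_{g} \right)$ (10)

To compute $\hat{\mathcal{F}}^{xy}_{g}$, inspect Equation (9). Equation (9) shows that $\tilde{\mathcal{F}}^{xy}_{g}$ basically is a long chain of $\bullet$ operations, specifically

$\tilde{\mathcal{F}}^{xy}_{g} = (\hat{\mathcal{F}}_1 \bullet \hat{\mathcal{F}}_2 \bullet \hat{\mathcal{F}}_3 \bullet \cdots \bullet \hat{\mathcal{F}}_\ell) \bullet \{g(W)\}$ (11)

We define (and compute) $\hat{\mathcal{F}}^{xy}_{g}$ as follows

$\hat{\mathcal{F}}^{xy}_{g} = \mathrm{reduce}(\mathrm{reduce}(\ldots \mathrm{reduce}(\mathrm{reduce}(\hat{\mathcal{F}}_1 \bullet \hat{\mathcal{F}}_2) \bullet \hat{\mathcal{F}}_3) \bullet \ldots) \bullet \hat{\mathcal{F}}_\ell) \bullet \{g(W)\}$ (12)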
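The interleaving of products and reductions in Equation (12) can be sketched as follows. This is a toy illustration, not the algorithm of Theorem 2: `reduce_family` is a brute-force stand-in that is exponential in $q$ and gives no guarantee on the size of its output, and the names `bullet`, `reduce_family` and `chained_reduce` are assumptions. The operation $\bullet$ is taken here to be the union of disjoint pairs, and the final product with $\{g(W)\}$ is omitted.

```python
from itertools import combinations


def bullet(fam_a, fam_b):
    """A . B: unions X | Y of pairs with X in A, Y in B and X, Y disjoint."""
    return {a | b for a in fam_a for b in fam_b if not (a & b)}


def reduce_family(family, universe, q):
    """Brute-force q-representative subfamily for a uniform matroid.

    For every Y with |Y| <= q: if some set of the family avoids Y,
    then some kept set avoids Y as well.
    """
    family = sorted(family, key=sorted)  # deterministic order
    kept = []
    for r in range(q + 1):
        for Y in combinations(sorted(universe), r):
            Y = frozenset(Y)
            if any(not (X & Y) for X in kept):
                continue  # Y is already handled by a kept set
            witness = next((X for X in family if not (X & Y)), None)
            if witness is not None:
                kept.append(witness)
    return set(kept)


def chained_reduce(families, universe, k):
    """reduce(...reduce(reduce(F1 . F2) . F3)...), shrinking after each product."""
    current = families[0]
    for nxt in families[1:]:
        product = bullet(current, nxt)
        if not product:
            return set()
        p = len(next(iter(product)))  # all sets in the product share one size
        current = reduce_family(product, universe, k - p)
    return current
```

Reducing after every product, rather than once at the end, is what keeps the intermediate families small in the running-time analysis.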



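$\hat{\mathcal{F}}^{xy}_{g} \subseteq_{rep} \tilde{\mathcal{F}}^{xy}_{g}$, and thus also $\hat{\mathcal{F}}^{xy}_{uv} \subseteq_{rep} \tilde{\mathcal{F}}^{xy}_{uv} \subseteq_{rep} \mathcal{F}^{xy}_{uv}$, follows from Lemma 3.3 and Theorem 2. Since the last operation we do in the construction of $\hat{\mathcal{F}}^{xy}_{uv}$ is a call to reduce, $|\hat{\mathcal{F}}^{xy}_{uv}| \le \binom{k}{|C^{xy}|} 2^{o(k)} \log n$ follows from Theorem 2. To conclude the computation we set

$\hat{\mathcal{F}}^{xy}_{u*} = \mathrm{reduce}\left( \bigcup_{v \in V(G) \setminus \{u\}} \hat{\mathcal{F}}^{xy}_{uv} \bullet \{\{v\}\} \right)$ (13)

Lemma 3.3 and Theorem 2 imply that $\hat{\mathcal{F}}^{xy}_{u*} \subseteq_{rep} \mathcal{F}^{xy}_{u*}$ and that $|\hat{\mathcal{F}}^{xy}_{u*}| \le \binom{k}{|C^{xy}|+1} 2^{o(k)} \log n$.

The algorithm computes the families $\hat{\mathcal{F}}^{xy}_{u*}$ and $\hat{\mathcal{F}}^{xy}_{uv}$ for every $x, y \in V(T)$ such that $y \le x$. It then selects an arbitrary leaf $\ell$ of $T$ and checks whether there exists a $u \in V(G)$ such that the family $\hat{\mathcal{F}}^{r\ell}_{u*}$ is non-empty. Since $\hat{\mathcal{F}}^{r\ell}_{u*} \subseteq_{rep} \mathcal{F}^{r\ell}_{u*}$, there is a non-empty $\hat{\mathcal{F}}^{r\ell}_{u*}$ if and only if there is a non-empty $\mathcal{F}^{r\ell}_{u*}$. Thus the algorithm can answer that there is a subgraph isomorphism from $T$ to $G$ if some $\hat{\mathcal{F}}^{r\ell}_{u*}$ is non-empty, and that no such subgraph isomorphism exists otherwise.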

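It remains to bound the running time of the algorithm. Up to polynomial factors, the running time of the algorithm is dominated by the computation of $\hat{\mathcal{F}}^{xy}_{uv}$. This computation consists of $n^{O(|\hat{W}|)}$ independent computations of the families $\hat{\mathcal{F}}^{xy}_{g}$. Each computation of the family $\hat{\mathcal{F}}^{xy}_{g}$ consists of at most $k$ repeated applications of the operation

$\hat{\mathcal{F}}^{i+1} = \mathrm{reduce}(\hat{\mathcal{F}}^{i} \bullet \hat{\mathcal{F}}_{i+1}).$

Here $\hat{\mathcal{F}}^{i}$ is a family of sets of size $p$, and so $|\hat{\mathcal{F}}^{i}| \le \binom{k}{p} 2^{o(k)} \log n$. On the other hand $\hat{\mathcal{F}}_{i+1}$ is a family of sets of size $p' \le \frac{k}{c}$ since we used Lemma 5.7 to construct $\hat{W}$. Thus $|\hat{\mathcal{F}}_{i+1}| \le \binom{k}{p'} 2^{o(k)} \log n \le 2^{\epsilon k} \log n$. Thus $|\hat{\mathcal{F}}^{i} \bullet \hat{\mathcal{F}}_{i+1}| \le \binom{k}{p} 2^{\epsilon k + o(k)} \log^2 n$. Hence, when we apply Theorem 2 to compute $\mathrm{reduce}(\hat{\mathcal{F}}^{i} \bullet \hat{\mathcal{F}}_{i+1})$, this takes time

$\le \binom{k}{p} \left(\frac{k}{k-p-p'}\right)^{k-p-p'} 2^{\epsilon k + o(k)} n^{O(1)}.$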
For p > jQ this is upper bounded by 2'^, regardless the choice of p'. For p < j^ we choose c
such that 

f rC E rC . , C 

p < — < — < (k — p) , 

and then the running time of one reduce(J-'' • J^i+i) step is upper bounded by 

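As the running time is dominated by $n^{O(|\hat{W}|)} = n^{O(c)}$ such reduction steps and $c = O(\frac{1}{\epsilon})$, the total running time of the algorithm is upper bounded by

$\max_{p \le k} \left[ \binom{k}{p} \left(\frac{k}{k-p}\right)^{k-p} \right] \cdot 2^{\epsilon k} n^{O(1/\epsilon)}.$

It is possible to show that for $\alpha = 1 + \frac{1 - \sqrt{1+4e}}{2e}$,

$\max_{p \le k} \left[ \binom{k}{p} \left(\frac{k}{k-p}\right)^{k-p} \right] \le \left(\frac{1}{\alpha}\right)^{\alpha k} \left(\frac{1}{1-\alpha}\right)^{2(1-\alpha)k}.$

This yields the following theorem.

Theorem 11. For every $\epsilon > 0$, k-Tree is solvable in time $\left(\frac{1}{\alpha}\right)^{\alpha k} \left(\frac{1}{1-\alpha}\right)^{2(1-\alpha)k} \cdot 2^{\epsilon k} n^{O(1/\epsilon)}$, where $\alpha = 1 + \frac{1 - \sqrt{1+4e}}{2e}$. For sufficiently small $\epsilon$ this is upper bounded by $2.851^k n^{O(1)}$.

The algorithm for k-Tree can be generalized to k-Subgraph Isomorphism for the case when the pattern graph $F$ has treewidth at most $t$. Towards this we need a result analogous to Lemma 5.7 for trees, which can be proved using the separation properties of graphs of treewidth at most $t$. This will lead to an algorithm with running time $2.851^k \cdot n^{O(t)}$.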

5.5 Other Applications

Marx [33] gave algorithms for several problems based on matroid optimization. The main theorem in his work is Theorem 1.1 of [33], on which most applications of [33] are based. The proof of the theorem uses an algorithm to find representative sets as a black box. Applying our algorithm (Theorem 1 of this paper) instead gives an improved version of Theorem 1.1 of [33].

Proposition 5.4. Let $M = (E, \mathcal{I})$ be a linear matroid where the ground set is partitioned into blocks of size $\ell$. Given a linear representation $A_M$ of $M$, it can be determined in $O(2^{\omega k \ell} \|A_M\|^{O(1)})$ randomized time whether there is an independent set that is the union of $k$ blocks. ($\|A_M\|$ denotes the length of $A_M$ in the input.)

Finally, we mention another application from [33] which we believe could be useful to obtain single-exponential time parameterized and exact algorithms.

ℓ-Matroid Intersection Parameter: k
Input: Let $M_1 = (E, \mathcal{I}_1), \ldots, M_\ell = (E, \mathcal{I}_\ell)$ be matroids on the same ground set $E$ given by their representations $A_{M_1}, \ldots, A_{M_\ell}$ over the same field $\mathbb{F}$ and a positive integer $k$.
Question: Does there exist a $k$-element set $X$ that is independent in each $M_i$ ($X \in \mathcal{I}_1 \cap \ldots \cap \mathcal{I}_\ell$)?

Using Theorem 1.1 of [33], Marx [33] gave a randomized algorithm for ℓ-Matroid Intersection. By using Proposition 5.4 instead we get the following result.

Proposition 5.5. ℓ-Matroid Intersection can be solved in $O(2^{\omega k \ell} \|A_M\|^{O(1)})$ randomized time.
6 Conclusion

In this paper, we gave fast algorithms computing representative families of independent sets in linear matroids. Moreover, we demonstrated that for families of uniform matroids even better algorithms are available. We also showed interesting links between representative families of matroids and the design of single-exponential parameterized and exact exponential algorithms. We believe that these connections have a potential for a wide range of applications.
The natural questions left open are the following.

• What is the best running time to compute a minimal $q$-representative family for a family of independent sets of a linear matroid? Can it be computed in time linear in the size of the minimal $q$-representative family, or can superlinear lower bounds be found?

• It would be interesting to find faster algorithms even for special classes of linear matroids.

• Finally, the only matroids we used in our algorithmic applications were graphic, uniform, and partition matroids. It would be interesting to see what kind of applications can be handled by other types of matroids.